Abstract. We discuss the definability of finite graphs in first-order logic with two
relation symbols for adjacency and equality of vertices. The logical depth $D(G)$
of a graph $G$ is equal to the minimum quantifier depth of a sentence defining $G$
up to isomorphism. The logical width $W(G)$ is the minimum number of variables
occurring in such a sentence. The logical length $L(G)$ is the length of a shortest
defining sentence. We survey known estimates for these graph parameters and discuss
their relations to other topics (such as the efficiency of the Weisfeiler-Lehman
algorithm in isomorphism testing, the evolution of a random graph, quantitative
characteristics of the zero-one law, or the contribution of Frank Ramsey to the
research on Hilbert's Entscheidungsproblem). Also, we trace the behavior of the
descriptive complexity of a graph as the logic becomes more restrictive (for example,
only definitions with a bounded number of variables or quantifier alternations
are allowed) or more expressive (after powering with counting quantifiers).
1. Introduction

1.1. Basic notions and examples. We consider the first-order language of graph 
theory whose vocabulary contains two relation symbols ~ and =, respectively for 
adjacency and equality of vertices. The term first-order imposes the condition that 
the variables represent vertices and hence the quantifiers apply to vertices only. 
Without quantification over sets of vertices, we are unable to express by a single 
formula some basic properties of graphs, such as being bipartite, being connected, 
etc. (see, e.g., [72, Theorems 2.4.1 and 2.4.2]). However, first-order logic is powerful
enough to define any individual graph. How succinctly this can be done is the 
subject of this article. 

As a starting example, let us say in the first-order language that vertices $x$ and $y$
are at distance at most $n$ from one another. A possible formula $\Delta_n(x,y)$ can look
as follows:

$$\Delta_1(x,y) := x \sim y \lor x = y,$$

$$\Delta_n(x,y) := \exists z_1 \ldots \exists z_{n-1} \Bigl( \Delta_1(x,z_1) \land \bigwedge_{i=1}^{n-2} \Delta_1(z_i,z_{i+1}) \land \Delta_1(z_{n-1},y) \Bigr). \tag{1}$$

By a sentence we mean a first-order formula where every variable is bound by a
quantifier. If we specify a graph $G$, a sentence $\Phi$ is either true or false on it. If $H$ is
a graph isomorphic to $G$, then $\Phi$ is either true or false on $G$ and $H$ simultaneously.
In other words, first-order logic cannot distinguish between isomorphic graphs. In
general, we say that a sentence $\Phi$ distinguishes a graph $G$ from another graph $H$ if
$\Phi$ is true on $G$ but false on $H$.

For example, the sentence $\forall x \forall y\, \Delta_1(x,y)$ distinguishes a complete graph $K_n$ from any
graph $H$ that is not complete. The sentence $\forall x \forall y\, \Delta_{n-1}(x,y)$ distinguishes $P_n$, the
path with $n$ vertices, from any longer path $P_m$, $m > n$.

Throughout this survey we consider only graphs whose vertex set is finite and
non-empty. We say that a sentence $\Phi$ defines a graph $G$ (up to isomorphism) if $\Phi$
distinguishes $G$ from every non-isomorphic graph $H$.

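For example, the single-vertex graph $P_1$ is defined by the sentence $\forall x \forall y\, (x = y)$. If
$n > 2$, then the path $P_n$ is defined by the conjunction of the following three parts:

$$\forall x \forall y\, \Delta_{n-1}(x,y) \land \neg \forall x \forall y\, \Delta_{n-2}(x,y)$$

to say that the diameter equals $n-1$;

$$\land\ \forall x\, \neg \exists y_1 \exists y_2 \exists y_3 \Bigl( \bigwedge_{i=1,2,3} x \sim y_i \land \bigwedge_{i \ne j} \neg (y_i = y_j) \Bigr) \tag{2}$$

to say that the maximum degree is at most 2;

$$\land\ \exists x\, \neg \exists y_1 \exists y_2 \Bigl( \bigwedge_{i=1,2} x \sim y_i \land \neg (y_1 = y_2) \Bigr)$$

to say that the minimum degree is at most 1 (thereby distinguishing $P_n$ from
the cycles $C_{2n-2}$ and $C_{2n-1}$).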
We have already mentioned the following basic fact: Every finite graph $G$ is
definable.¹ Indeed, let $V(G) = \{v_1,\ldots,v_n\}$ be the vertex set of $G$ and $E(G)$ be its
edge set. A sentence defining $G$ could read:

$$\exists x_1 \ldots \exists x_n \bigl( \mathrm{Distinct}(x_1,\ldots,x_n) \land \mathrm{Adj}(x_1,\ldots,x_n) \bigr) \land \forall x_1 \ldots \forall x_{n+1}\, \neg\, \mathrm{Distinct}(x_1,\ldots,x_{n+1}), \tag{3}$$

where, for notational convenience, we use the following shorthands:

$$\mathrm{Distinct}(x_1,\ldots,x_k) := \bigwedge_{1 \le i < j \le k} \neg (x_i = x_j),$$

$$\mathrm{Adj}(x_1,\ldots,x_n) := \bigwedge_{\{v_i,v_j\} \in E(G)} x_i \sim x_j \ \land \bigwedge_{\{v_i,v_j\} \notin E(G)} \neg (x_i \sim x_j).$$

In other words, we first specify that there are $n$ distinct vertices, list the adjacencies
and the non-adjacencies between them, and then state that we cannot find $n + 1$
distinct vertices.
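
As a minimal sketch of what the generic defining sentence asserts, the following Python function checks by brute force whether a graph $H$ satisfies the sentence written for $G$. The dictionary-of-neighbour-sets representation and the function name are assumptions made for this illustration only.

```python
from itertools import permutations

def satisfies_generic_sentence(g_adj, h_adj):
    """Check whether graph H satisfies the generic defining sentence of G.

    Graphs are dicts mapping each vertex to the set of its neighbours
    (a representation assumed here for illustration only).
    """
    g_vertices = list(g_adj)
    n = len(g_vertices)
    # Second conjunct: H has no n + 1 pairwise distinct vertices.
    if len(h_adj) > n:
        return False
    # First conjunct: some n distinct vertices of H realise exactly
    # the adjacencies and non-adjacencies listed for G.
    for image in permutations(h_adj, n):
        assignment = dict(zip(g_vertices, image))
        if all(
            (v in g_adj[u]) == (assignment[v] in h_adj[assignment[u]])
            for u in g_vertices for v in g_vertices if u != v
        ):
            return True
    return False

path3 = {0: {1}, 1: {0, 2}, 2: {1}}
relabelled_path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

The search over all injective assignments mirrors the block of $n$ existential quantifiers, which is why this check is exhaustive and wasteful in the same way as the sentence itself.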

The sentence (3) is an exhaustive description of $G$ and seems rather wasteful. We
want to know if there is a more succinct way of defining a graph on $n$ vertices. The
following natural succinctness measures of a first-order formula $\Phi$ are of interest:

• the length $L(\Phi)$, which is the total number of symbols in $\Phi$ (each variable
symbol contributes 1);

• the quantifier depth $D(\Phi)$, which is the maximum length of a chain of nested
quantifiers in $\Phi$;

• the width $W(\Phi)$, which is the number of variables used in $\Phi$ (different
occurrences of the same variable are not counted).²

Formula $\Delta_n$ in (1) was intentionally written in a non-optimal way. Note that
$L(\Delta_n) = \Theta(n)$, $D(\Delta_n) = n - 1$, and $W(\Delta_n) = n + 1$. The same distance restriction
can be expressed more succinctly with respect to the latter two parameters, namely

$$\Delta'_1(x,y) := \Delta_1(x,y),$$

$$\Delta'_n(x,y) := \exists z \bigl( \Delta'_{\lfloor n/2 \rfloor}(x,z) \land \Delta'_{\lceil n/2 \rceil}(z,y) \bigr),$$

where $\lceil x \rceil$ (resp. $\lfloor x \rfloor$) stands for the integer nearest to $x$ from above (resp. from
below). Now $D(\Delta'_n) = \lceil \log_2 n \rceil$, giving an exponential gain for the quantifier depth!
The width can be reduced even more drastically: by recycling variables we can write
$\Delta'_n$ with only 3 variables in total, achieving $W(\Delta'_n) = 3$.
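
The divide-and-conquer definition of the distance restriction can be sketched in Python as follows. The evaluator below follows the recursive definition literally (one existential quantifier over a midpoint per level), and a second helper counts the nesting depth. The graph representation (a dict of neighbour sets) and both function names are assumptions of this sketch.

```python
def within_distance(adj, x, y, n):
    """Evaluate the divide-and-conquer distance formula for bound n on a graph."""
    if n == 1:
        # Base case: x and y are adjacent or equal.
        return x == y or y in adj[x]
    # One existential quantifier over the midpoint z; the two halves
    # have bounds floor(n/2) and ceil(n/2).
    return any(
        within_distance(adj, x, z, n // 2)
        and within_distance(adj, z, y, n - n // 2)
        for z in adj
    )

def quantifier_depth(n):
    """Nesting depth of quantifiers in the recursive formula for bound n."""
    return 0 if n == 1 else 1 + quantifier_depth(n - n // 2)

# A path on six vertices 0 - 1 - 2 - 3 - 4 - 5.
path6 = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 5} for i in range(6)}
```

On `path6`, the end vertices satisfy the formula for $n = 5$ but not for $n = 4$, and `quantifier_depth(n)` agrees with $\lceil \log_2 n \rceil$ on the values one can easily check.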

We now come to the central concepts of our survey. Let us define $L(G)$ (resp.
$D(G)$, $W(G)$) to be the minimum of $L(\Phi)$ (resp. $D(\Phi)$, $W(\Phi)$) over all sentences
$\Phi$ defining a graph $G$. We will call these graph invariants, respectively, the logical
length, depth, and width of $G$.

1 This fact, though very simple, highlights a fundamental difference between the finite and the
infinite: There are non-isomorphic countable graphs satisfying precisely the same first-order
sentences (see, e.g., [72, Theorem 3.3.2]).

2 Grädel [33] defines the width of a formula $\Phi$ as the maximum number of free variables in a
subformula of $\Phi$. Denote this version by $W'(\Phi)$. Clearly, $W'(\Phi) \le W(\Phi)$ and the inequality can
be strict. Nevertheless, the two parameters are closely related: $\Phi$ can be rewritten by renaming
bound variables in an equivalent form $\Phi'$ so that $W(\Phi') = W'(\Phi)$; see [32, Lemma 3.1.4].

Example 1.1.

1. Using $\Delta'_n$ in place of $\Delta_n$ in (2), we see that $D(P_n) < \log_2 n + 3$ and $W(P_n) \le 4$.
The reader is encouraged to improve the latter to $W(P_n) \le 3$.

2. The generic defining sentence (3) shows that $L(G) = O(n^2)$ and $D(G) \le n + 1$
for every graph $G$ on $n$ vertices.

3. The complement of $G$, denoted by $\overline{G}$, is the graph on the same vertex set
$V(G)$ whose edges are those pairs that are not in $E(G)$. One can easily prove
that $D(\overline{G}) = D(G)$ and $W(\overline{G}) = W(G)$.

The logical length, depth, and width of a graph satisfy the following inequalities:

$$W(G) \le D(G) \le L(G).$$

The latter relation follows from the obvious fact that $D(\Phi) < L(\Phi)$ for any first-order
formula $\Phi$. The former follows from the slightly less obvious fact that for any first-order
formula $\Phi$ there is a logically equivalent formula $\Psi$ with $W(\Psi) \le D(\Phi)$.

1.2. Variations of logic. 

1.2.1. Fragments. Suppose that we put some restrictions on the structure of a defining
sentence. This may cause an increase in the resources (length, depth, width)
that we need in order to define a graph in the straitened circumstances. These effects
will be one of our main concerns in this survey. We will deal with restrictions of the
following two sorts. We may be allowed to make only a small (constant) number of
quantifier alternations or to use only a bounded number of variables. The former is
commonly used in logic and complexity theory to obtain hierarchical classifications
of various problems. The latter is the focus of finite-variable logics (see, e.g.,
Grohe [31]). Moreover, the number of variables is relevant to the computational
complexity of the graph isomorphism problem; see Section 4.

Bounded number of quantifier alternations. A first-order formula $\Phi$ with connectives
$\{\neg, \land, \lor\}$ is in negation normal form if all negations apply only to relations (one can
think that we now do not have negation at all but introduce instead two new relation
symbols, for inequality and non-adjacency). It is well known that this structural
restriction does not actually make first-order logic weaker: We can always move
negations in front of relation symbols without increasing the formula's length by more
than a factor of two and without changing the quantifier depth and the width.

Given such a formula $\Phi$ and a sequence of nested quantifiers in it, we count the
number of quantifier alternations, that is, the number of successive pairs $\forall\exists$ and $\exists\forall$
in the sequence. The alternation number of $\Phi$ is the maximum number of quantifier
alternations over all such sequences. The $a$-alternation logic consists of all first-order
formulas in negation normal form whose alternation number does not exceed $a$.
We will adhere to the following notational convention: a subscript $a$ will always
indicate that at most $a$ quantifier alternations are allowed. For example, $D_a(G)$ is
the minimum quantifier depth of a sentence in the $a$-alternation logic that defines a
graph $G$.
For any graph $G$ on $n$ vertices we have

$$D(G) \le \ldots \le D_{a+1}(G) \le D_a(G) \le \ldots \le D_1(G) \le D_0(G) \le n + 1,$$

where the last bound is due to the defining sentence (3).
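
The alternation number defined above can be computed mechanically once the sequences of nested quantifiers of a formula are listed. The following is a small sketch; encoding each sequence as a string over `A` (universal) and `E` (existential) is an assumption of this illustration, as is the function name.

```python
def alternation_number(quantifier_sequences):
    """Maximum number of quantifier alternations over the given sequences.

    Each sequence of nested quantifiers is a string over 'A' (for all)
    and 'E' (exists); this encoding is an assumption of the sketch.
    """
    def alternations(seq):
        # Count successive pairs AE and EA in one sequence.
        return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

    return max((alternations(seq) for seq in quantifier_sequences), default=0)
```

For instance, a formula whose nested sequences are `AAE` and `EE` has alternation number 1, so under this encoding it would belong to the 1-alternation logic but not to the 0-alternation logic.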

Bounded number of variables. The $k$-variable logic is the fragment of first-order
logic where only $k$ variable symbols are available, that is, the formula width is
bounded by $k$. The restriction of defining sentences to the $k$-variable logic will
always be indicated by a superscript $k$. To make this notation always applicable, we
set $D^k(G) = \infty$ if the $k$-variable logic is too weak to define $G$. If $k \ge W(G)$ for a
graph $G$ of order $n$, then we have

$$D(G) \le D^{k+1}(G) \le D^k(G) \le n^{k-1} + k,$$

where the last bound will be established in Theorem 4.7 below. Note that the
bounds in Example 1.1.1 can be strengthened to $D^3(P_n) < \log_2 n + 3$.

1.2.2. An extension with counting quantifiers. We will also enrich first-order logic by
allowing one to use expressions of the type $\exists^m \Psi$ in order to say that there are at least
$m$ vertices with property $\Psi$. Those are called counting quantifiers and the extended
logic will be referred to as counting logic. A counting quantifier $\exists^m$ contributes 1
to the quantifier depth irrespective of the value of $m$. For the counting logic we
will use the "sharp-notation", thus denoting the logical depth and width of a graph
$G$ in this logic, respectively, by $D_\#(G)$ and $W_\#(G)$. Clearly, $D_\#(G) \le D(G)$ and
$W_\#(G) \le W(G)$. The counting quantifiers often allow us to define a graph much
more succinctly. For example, $D_\#(K_n) = W_\#(K_n) = 2$ as this graph is defined by

$$\forall x \forall y\, (x \sim y \lor x = y) \land \exists^n x\, (x = x) \land \neg \exists^{n+1} x\, (x = x).$$

This is in sharp contrast with the fact that $D(K_n) = W(K_n) = n + 1$, where the
lower bound follows from the simple observation that $n$ variables are not enough to
distinguish between $K_n$ and $K_{n+1}$.
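
To make the three conjuncts of the counting sentence for the complete graph concrete, here is a minimal Python sketch that evaluates them on a given graph. The representation of graphs as dicts of neighbour sets and the function name are assumptions made for illustration.

```python
def defines_complete_graph(adj, n):
    """Evaluate the counting-logic sentence for the complete graph on n vertices."""
    vertices = list(adj)
    # for all x, for all y: x ~ y or x = y
    all_adjacent = all(x == y or y in adj[x] for x in vertices for y in vertices)
    # exists^n x (x = x): there are at least n vertices
    at_least_n = len(vertices) >= n
    # not exists^(n+1) x (x = x): there are fewer than n + 1 vertices
    fewer_than_n_plus_1 = not (len(vertices) >= n + 1)
    return all_adjacent and at_least_n and fewer_than_n_plus_1

k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

Each counting quantifier is evaluated by a single cardinality comparison, which reflects why it contributes only 1 to the quantifier depth regardless of the threshold.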

1.3. Outline of the survey. Section 2 specifies notation and proves a couple of
basic facts about first-order sentences. The latter are applied to establish an upper
bound on the logical length $L(G)$ of a graph in terms of its logical depth $D(G)$ and
to estimate from above the number of graphs whose logical depth is bounded by a
given parameter $k$. The existence of such bounds is more important than the bounds
themselves, which are huge, involving the tower function. Furthermore, we define
$D(G,H)$ to be the smallest quantifier depth sufficient to distinguish between
non-isomorphic graphs $G$ and $H$. We will observe that the obvious inequality $D(G,H) \le
D(G)$ gives a sharp lower bound on $D(G)$. Thus estimating $D(G)$ reduces to
estimating $D(G,H)$ for all $H \not\cong G$.

The value of $D(G,H)$ is characterized in Section 3 as the length of the Ehrenfeucht
game on $G$ and $H$. Moreover, the logical width admits a characterization in terms
of another parameter of the game. Thus, the determination of the logical depth and
width of a graph reduces to designing optimal strategies in the Ehrenfeucht game.

In Section 4 the logical width and the logical depth are also characterized, respectively,
as the minimum dimension and the minimum number of rounds such that
the so-called Weisfeiler-Lehman algorithm returns the correct answer. The algorithm
tries to decide whether two input graphs are isomorphic; its one-dimensional
version is just the well-known color-refining procedure. Thus, an analysis of the
algorithm can give us information on the logical complexity of the input graphs.
This relationship is even more advantageous in the other direction: Once we prove
that all graphs in some class $C$ have low logical complexity, we immediately obtain
an efficient isomorphism test for $C$.

This paradigm is successful for graphs with bounded treewidth and planar graphs,
with good prospects for covering all classes of graphs with an excluded minor. In
Section 5.1 we report strong upper bounds for the logical depth/width of graphs in
these classes. In Section 5.2 we survey the bounds known in the general case. In
particular, if a graph $G$ on $n$ vertices has no twins, i.e., no two vertices have the
same adjacency to the rest of the graph, then $D(G) \le \frac{1}{2}n + 3$. The factor of $\frac{1}{2}$
can be improved for graphs with bounded vertex degrees. Here we have to content
ourselves with linear bounds in view of a linear lower bound by Cai, Fürer, and
Immerman [15]. They constructed examples of graphs with maximum degree 3 such
that $W_\#(G) > cn$ for a positive constant $c$.

Section 6 discusses the logical complexity of a random graph. We obtain rather
close lower and upper bounds for almost all graphs. Furthermore, we trace the
behavior of the logical depth in the evolutional random graph model $G_{n,p}$ where $p$
is a function of $n$.

While in Sections 5 and 6 we deal with, respectively, worst-case and average-case
bounds, Section 7 is devoted to the best case. More specifically, we define the
succinctness function $q(n)$ to be equal to the minimum of $D(G)$ over all $G$ on $n$
vertices. Since only finitely many graphs are definable with a fixed quantifier depth,
$q(n)$ goes to infinity as $n$ increases. It turns out that its growth is inconceivably slow:
We show a superrecursive gap between the values of $q(n)$ and $n$. This phenomenon
disappears if we "smoothen" $q(n)$ by considering the least monotonic upper bound
for this function: the smoothed succinctness function is very close to the log-star
function. Furthermore, the succinctness function can be considered in any logic. Let
$q_0(n)$ be its variant for the logic with no quantifier alternation. We can determine
$q_0(n)$ with rather high precision: It is also related to the log-star function. The
lower bound for $q_0(n)$ implies a superrecursive gap between the graph parameters
$D(G)$ and $D_0(G)$, yet another piece of evidence of the weakness of the 0-alternation logic.
The tight upper bound for $q_0(n)$ shows that, nevertheless, there are graphs whose
definitions, even if quantifiers are not allowed to alternate, can have surprisingly low
quantifier depth. We give several methods for the explicit construction of such graphs.
These constructions have another interesting aspect. They allow us to show that
the previously mentioned tower-function bounds from Section 2 cannot be improved
substantially.

Some of the most interesting open questions are collected in Section 8.

1.4. Other structures. Some of the results presented in the survey generalize to
relational structures over a fixed vocabulary. Such generalizations are often
straightforward. For example, the upper bounds on succinctness functions hold true if the
vocabulary contains at least one relation symbol of arity more than 1 (since any
graph can be trivially represented as a structure over this vocabulary). Extension
of the worst-case bounds to general structures is also possible but requires essential
additional efforts; see [65].

Various definability parameters were also investigated for special structures: colored
graphs (Immerman and Lander [17], Cai, Fürer, and Immerman [15]), digraphs
and hypergraphs (Pikhurko, Veith, and Verbitsky [64]), bit strings and ordered trees
(Spencer and St. John [73]), linear orders (Grohe and Schweikardt [H]).

2. Preliminaries 

2.1. Notation: Arithmetic and graphs. We define the tower function by
$\mathrm{Tower}(0) = 1$ and $\mathrm{Tower}(i) = 2^{\mathrm{Tower}(i-1)}$ for each subsequent integer $i$. Given a
function $f$, by $f^{(i)}(x)$ we will denote the $i$-fold composition of $f$. In particular, $f^{(0)}(x) =
x$. By $\log n$ we always mean the logarithm base 2. The "inverse" of the tower
function, the log-star function $\log^* n$, is defined by $\log^* n = \min\{i : \mathrm{Tower}(i) \ge n\}$. We
use the standard asymptotic notation. For example, $f(n) = \Omega(g(n))$ means that
there is a constant $c > 0$ such that $f(n) \ge c\,g(n)$ for all sufficiently large $n$.
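
The tower and log-star functions are easy to compute for small arguments. The sketch below is illustrative only: the function names are ours, and the comparison in `log_star` is taken to be non-strict, which is an assumption about the intended definition (the comparison sign is ambiguous in the extracted text).

```python
def tower(i):
    """Tower(0) = 1 and Tower(i) = 2 ** Tower(i - 1)."""
    value = 1
    for _ in range(i):
        value = 2 ** value
    return value

def log_star(n):
    """Least i with Tower(i) >= n (non-strict comparison assumed)."""
    i = 0
    while tower(i) < n:
        i += 1
    return i
```

The first values are Tower(1) = 2, Tower(2) = 4, Tower(3) = 16, and Tower(4) = 65536, so under this reading the log-star function equals 3 for every argument from 5 to 16.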

The number of vertices in a graph $G$ is called the order of $G$ and is denoted by
$v(G)$. The neighborhood $N(v)$ of a vertex $v$ consists of all vertices adjacent to $v$.
The degree of $v$ is defined by $\deg v = |N(v)|$. The maximum degree of a graph $G$ is
defined by $\Delta(G) = \max_{v \in V(G)} \deg v$.

The distance between vertices $u$ and $v$ in a graph $G$ is defined to be the minimum
length of a path from $u$ to $v$ and denoted by $\mathrm{dist}(u,v)$. If $u$ and $v$ are in different
connected components, then we set $\mathrm{dist}(u,v) = \infty$. The eccentricity of a vertex
$v$ is defined by $e(v) = \max_{u \in V(G)} \mathrm{dist}(v,u)$.

Let $X \subseteq V(G)$. The subgraph induced by $G$ on $X$ is denoted by $G[X]$. We denote
$G \setminus X = G[V(G) \setminus X]$, which is the result of the removal of all vertices in $X$ from
$G$. If a single vertex $v$ is removed, we write $G - v = G \setminus \{v\}$. A set of vertices $X$ is
called homogeneous if $G[X]$ is a complete or an empty graph.

A graph is $k$-connected if it has at least $k + 1$ vertices and remains connected after
removal of any $k - 1$ vertices. 2-connected graphs are also called biconnected.

A graph is asymmetric if it admits no non-trivial automorphism. 

2.2. A length-depth relation. We have already mentioned the trivial relation
$D(G) < L(G)$. Now we aim at bounding $L(G)$ from above in terms of $D(G)$. We
write $G \equiv_k H$ to say that graphs $G$ and $H$ cannot be distinguished by any sentence
with quantifier depth $k$. As is easy to see, $\equiv_k$ is an equivalence relation. Its
equivalence classes will be referred to as $\equiv_k$-classes. We say that a sentence $\Phi$
defines a $\equiv_k$-class $\alpha$ if $\Phi$ is true on all graphs in $\alpha$ and false on all other graphs.

Lemma 2.1.

1. The number of $\equiv_k$-classes is finite and does not exceed $\mathrm{Tower}(k + \log^* k + 2)$.

2. Every $\equiv_k$-class is definable by a sentence $\Phi$ with $D(\Phi) = k$ and $L(\Phi) <
\mathrm{Tower}(k + \log^* k + 2)$.
Proof. The case of $k = 1$ is easy: There is only one $\equiv_1$-class (consisting of all graphs),
which is definable by $\forall x\,(x = x)$.

Let $k \ge 2$ and $0 \le s \le k$. When we write $\bar z$, we will mean an $s$-tuple $(z_1,\ldots,z_s)$
(if $s = 0$, the sequence is empty). If $\bar u \in V(G)^s$ and $\Phi$ is a formula with $s$ free
variables $x_1,\ldots,x_s$, then the notation $G,\bar u \models \Phi(\bar x)$ will mean that $\Phi(\bar x)$ is true on $G$
with each $x_i$ being assigned the respective $u_i$ as its value.

A formula $\Phi(x_1,\ldots,x_s)$ of quantifier depth $k - s$ is normal if $\Phi$ is built from
variables $x_1,\ldots,x_k$ and every maximal sequence of nested quantifiers in $\Phi$ has length
$k - s$ and quantifies the variables $x_{s+1},\ldots,x_k$ exactly in this order. A simple
inductive syntactic argument shows that any $\Phi(x_1,\ldots,x_s)$ has an equivalent normal
formula $\Phi'(x_1,\ldots,x_s)$ of the same quantifier depth as $\Phi$.

We write $G,\bar u \equiv_{k,s} H,\bar v$ to say that $G,\bar u \models \Phi(\bar x)$ exactly when $H,\bar v \models \Phi(\bar x)$ for
every normal formula $\Phi$ of quantifier depth $k - s$. A normal formula $\Phi(\bar x)$ defines
a $\equiv_{k,s}$-class $\alpha$ if $G,\bar u \models \Phi(\bar x)$ exactly when $G,\bar u$ belongs to $\alpha$. The $\equiv_{k,s}$-equivalence
class of $G,\bar u$ will be denoted by $[G,\bar u]_{k,s}$.

Let $f(k,s)$ denote the number of all $\equiv_{k,s}$-classes and $l(k,s)$ denote the minimum
$l$ such that every $\equiv_{k,s}$-class is definable by a normal formula of depth at most $k - s$
and length at most $l$. Note that the relations $\equiv_k$ and $\equiv_{k,0}$ coincide. Thus, our goal is
to estimate the numbers $f(k,0)$ and $l(k,0)$ from above.

We use backward induction on $s$. A $\equiv_{k,k}$-class can be determined by specifying,
for each pair of the $k$ elements, whether they are equal and, if not, whether they
are adjacent or non-adjacent. There are at most three choices per pair. It easily
follows that $f(k,k) \le 3^{\binom{k}{2}}$ and $l(k,k) \le 9k^2$. We are now going to estimate $f(k,s)$
and $l(k,s)$ in terms of $f(k,s+1)$ and $l(k,s+1)$. Suppose that each $\equiv_{k,s+1}$-class $\beta$
is defined by a formula $\Phi_\beta(x_1,\ldots,x_s,x_{s+1})$ whose length is bounded by $l(k,s+1)$.

Define $S(G,\bar u) = \{\, [G,\bar u,u]_{k,s+1} : u \in V(G) \,\}$, the set of $\equiv_{k,s+1}$-classes obtainable
from $G,\bar u$ by specifying one extra vertex. Note that

$$G,\bar u \equiv_{k,s} H,\bar v \quad\text{if and only if}\quad S(G,\bar u) = S(H,\bar v).$$

Indeed, suppose that $S(G,\bar u) \ne S(H,\bar v)$, say, $\beta = [G,\bar u,u]_{k,s+1}$ is not in $S(H,\bar v)$
for some $u \in V(G)$. Then $G,\bar u \not\equiv_{k,s} H,\bar v$ because the formula $\exists x_{s+1} \Phi_\beta$ is true for $G,\bar u$
but false for $H,\bar v$. Suppose now that $G,\bar u$ and $H,\bar v$ are distinguishable by a normal
formula of quantifier depth $k - s$. As is easily seen, they are distinguishable by
such a formula of the form $\exists x_{s+1} \Phi$. Without loss of generality, assume that the
formula $\exists x_{s+1} \Phi$ is true for $G,\bar u$ but false for $H,\bar v$. Let $u \in V(G)$ be such that
$G,\bar u,u \models \Phi$. Since $\Phi$ distinguishes $G,\bar u,u$ from all $H,\bar v,v$ with $v \in V(H)$, the class
$[G,\bar u,u]_{k,s+1}$ is not in $S(H,\bar v)$ and, hence, $S(G,\bar u) \ne S(H,\bar v)$.

Thus, for a $\equiv_{k,s}$-class $\alpha$ we can correctly define the set of $\equiv_{k,s+1}$-classes accessible
from $\alpha$ by $S(\alpha) = S(G,\bar u)$ for some (in fact, arbitrary) $G,\bar u$ in $\alpha$. It follows from
what we have proved that for arbitrary $\equiv_{k,s}$-classes $\alpha$ and $\alpha'$, we have

$$\alpha = \alpha' \quad\text{if and only if}\quad S(\alpha) = S(\alpha').$$

As an immediate consequence,

$$f(k,s) \le 2^{f(k,s+1)}.$$
Since $2\binom{k}{2} \le 2^k$ for every integer $k \ge 1$, we have $f(k,k) \le 3^{\binom{k}{2}} \le 2^{2\binom{k}{2}} \le 2^{2^k} \le \mathrm{Tower}(\log^* k + 2)$.
By the above recursion, we conclude that $f(k,0) \le \mathrm{Tower}(k + \log^* k + 2)$, which
proves Part 1 of the lemma.

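Another conclusion is that any $\equiv_{k,s}$-class $\alpha$ can be defined by a normal formula³

$$\Phi_\alpha(\bar x) := \bigwedge_{\beta \in S(\alpha)} \exists x_{s+1}\, \Phi_\beta(\bar x, x_{s+1}) \ \land\ \forall x_{s+1} \bigwedge_{\beta \notin S(\alpha)} \neg\, \Phi_\beta(\bar x, x_{s+1}).$$

Looking at the length of $\Phi_\alpha(\bar x)$, we obtain the recurrence

$$l(k,s) \le f(k,s+1)\,\bigl(l(k,s+1) + 9\bigr). \tag{5}$$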

Set $g(x) = 2^x (x + 9)$. A simple inductive argument shows that

$$f(k,s) \le 2^{g^{(k-s)}(9k^2)} \quad\text{and}\quad l(k,s) \le g^{(k-s)}(9k^2).$$

Define the two-parameter function $\mathrm{Tower}(i,x)$ inductively on $i$ by $\mathrm{Tower}(0,x) = x$
and $\mathrm{Tower}(i+1,x) = 2^{\mathrm{Tower}(i,x)}$ for $i \ge 0$. This is a generalization of the old function:
$\mathrm{Tower}(i,1) = \mathrm{Tower}(i)$. One can prove by induction on $i$ that for any $x \ge 5$ and
$i \ge 1$ we have

$$g^{(i)}(x) \le \mathrm{Tower}(i+1,x)/2. \tag{6}$$

Indeed, it is easy to check the validity of (6) for $i = 1$, while for $i \ge 2$ we have

$$g^{(i)}(x) \le g\bigl(\mathrm{Tower}(i,x)/2\bigr) \le 2^{\mathrm{Tower}(i,x) - 1} = \mathrm{Tower}(i+1,x)/2. \tag{7}$$

We have for all $k \ge 5$ that $9k^2 < \mathrm{Tower}(\log^* k + 1)$. This follows from $9k^2 < 2^k$ for
$k \ge 10$ and can be checked by hand for $5 \le k \le 9$. Thus, for $k \ge 5$, we have by (6)
that

$$l(k,0) \le g^{(k)}(9k^2) \le \mathrm{Tower}(k+1,9k^2)/2 < \mathrm{Tower}(k+1,9k^2) \le \mathrm{Tower}(k + \log^* k + 2).$$

Routine calculations (omitted) based on (5) and the exact initial values $f(2,2) = 3$,
$f(3,3) = 15$, and $f(4,4) = 127$ give Part 2 of the lemma for $2 \le k \le 4$. □
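
The exact initial values quoted at the end of the proof can be reproduced by a short computation. The sketch below rests on a counting argument that is ours rather than the text's: a class of $k$-tuples at the bottom level of the induction is determined by which coordinates coincide (a set partition of the $k$ positions into $j$ blocks) together with an arbitrary graph on the $j$ distinct vertices. The function names are hypothetical.

```python
from math import comb

def stirling2(k, j):
    """Number of partitions of a k-element set into j non-empty blocks."""
    if k == j:
        return 1
    if j == 0 or j > k:
        return 0
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)

def count_bottom_level_classes(k):
    """Count equality/adjacency patterns of k-tuples of vertices.

    Assumed counting argument: choose which positions coincide
    (a set partition into j blocks), then any graph on the j
    distinct vertices, giving 2 ** C(j, 2) choices.
    """
    return sum(stirling2(k, j) * 2 ** comb(j, 2) for j in range(1, k + 1))
```

Under this assumption the function returns 3, 15, and 127 for $k = 2, 3, 4$, in agreement with the values used in the proof, and each of these is at most the cruder bound of three choices per pair.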

Lemma 2.1.2 gives us a bound for the logical length of a graph in terms of its
logical depth. It suffices to notice that each single graph $G$ constitutes a $\equiv_k$-class
for $k = D(G)$.

Theorem 2.2 (Pikhurko, Spencer, and Verbitsky [61]).

$$L(G) < \mathrm{Tower}(D(G) + \log^* D(G) + 2).$$

In fact, [61, Theorem 10.1] states only that $L(G) \le \mathrm{Tower}(D(G) + \log^* D(G) +
O(1))$. Here we went to the trouble of estimating the error term more precisely so
that Lemma 2.1.2 and some of its consequences can be stated more neatly.

Lemma 2.1.1 gives the following result.

Theorem 2.3. The number of graphs with logical depth at most $k$ does not exceed
$\mathrm{Tower}(k + \log^* k + 2)$.

Notice two further consequences of Lemma 2.1.

3 This is a variant of Hintikka's formula, cf. Definition 2.2.5]. 
Theorem 2.4.

1. There are at most $\mathrm{Tower}(k + \log^* k + 3)$ pairwise inequivalent sentences about
graphs of quantifier depth $k$.

2. Every sentence $\Phi$ about graphs of quantifier depth $k$ has an equivalent sentence
$\Phi'$ with the same quantifier depth and length less than $3\,\mathrm{Tower}(k +
\log^* k + 2)^2$.

Proof. Note that, if a sentence $\Phi$ has quantifier depth $k$, then the set of all graphs
on which $\Phi$ is true is the union of some $\equiv_k$-classes. Therefore, there are $2^{f(k)}$ and
no more pairwise inequivalent sentences of quantifier depth $k$, where $f(k)$ is the
number of $\equiv_k$-classes. Part 1 now follows from Lemma 2.1.1. For the same reason
every sentence $\Phi$ of quantifier depth $k$ is equivalent to the disjunction of sentences
defining some $\equiv_k$-classes. By Lemma 2.1.2, such a disjunction does not need to be
longer than $(f(k) + 3)\,\mathrm{Tower}(k + \log^* k + 2)$. This proves Part 2. □

2.3. Distinguishability vs. definability. Given two non-isomorphic graphs $G$
and $H$, we define $D(G,H)$ (resp. $W(G,H)$) to be the minimum of $D(\Phi)$ (resp.
$W(\Phi)$) over all sentences $\Phi$ distinguishing $G$ from $H$. Thus, $D(G,H) > k$ if and
only if $G \equiv_k H$. Obviously, $D(G,H) = D(H,G)$. Also, $D(G,H) \le D(G)$ and
$W(G,H) \le W(G)$. It turns out that these inequalities are tight in the following
sense.

Lemma 2.5.

1. $D(G) = \max_{H \not\cong G} D(G,H)$.

2. $W(G) = \max_{H \not\cong G} W(G,H)$.

Proof. 1. For each $H$ non-isomorphic to $G$ fix a sentence $\Phi_H$ that distinguishes $G$
from $H$ and has the minimum possible quantifier depth, i.e., $D(\Phi_H) = D(G,H)$.
Consider the sentence $\Phi := \bigwedge_{H \not\cong G} \Phi_H$. It distinguishes $G$ from each non-isomorphic
$H$ and has quantifier depth $\max_H D(\Phi_H)$. Therefore, $D(G) \le \max_H D(G,H)$ as
wanted. An obvious drawback of this argument is that the above conjunction over
$H$ in $\Phi$ is actually infinite. However, we have $D(\Phi_H) \le D(G)$ and there are only
finitely many pairwise inequivalent first-order sentences about graphs of bounded
quantifier depth, see Theorem 2.4 above. Thus we can obtain a legitimate finite
sentence defining $G$ by removing from $\Phi$ duplicates up to logical equivalence.

2. Running the same argument, we have to "prune" the infinite conjunction
$\bigwedge_{H \not\cong G} \Phi_H$, where $W(\Phi_H) = W(G,H)$. Here we encounter a complication because
there are infinitely many inequivalent sentences of the same width. (Consider, e.g.,
the sentences from Example 1.1.1.) However, Theorem 4.7.1 in Section 4 implies
that for every $H$ we can additionally require that the depth of $\Phi_H$ is at most, for
example, $n^n + n$, where $n$ is the order of $G$. Now we can proceed as in Part 1 of the
lemma. □

Lemma 2.5 stays true in any finite-variable logic, any logic with a bounded number
of quantifier alternations, the logic with counting quantifiers, and any hybrid thereof.
We set $D^k(G,H) = \infty$ if $k$ variables do not suffice to distinguish $G$ from $H$.
3. Ehrenfeucht games

Let $G$ and $H$ be graphs with disjoint vertex sets. The $r$-round $k$-pebble Ehrenfeucht
game on $G$ and $H$, denoted by $\mathrm{Ehr}_r^k(G,H)$, is played by two players, Spoiler and
Duplicator, to whom we may refer as he and she respectively. The players have
at their disposal $k$ pairwise distinct pebbles $p_1,\ldots,p_k$, each given in duplicate. A
round consists of a move of Spoiler followed by a move of Duplicator. At each move
Spoiler takes a pebble, say $p_i$, selects one of the graphs $G$ or $H$, and places $p_i$ on a
vertex of this graph. In response Duplicator should place the other copy of $p_i$ on a
vertex of the other graph. It is allowed to move previously placed pebbles to other
vertices and place more than one pebble on the same vertex.

After each round of the game, for $1 \le i \le k$ let $x_i$ (resp. $y_i$) denote the vertex of
$G$ (resp. $H$) occupied by $p_i$, irrespective of which player placed the pebble
on this vertex. If $p_i$ is off the board at this moment, $x_i$ and $y_i$ are undefined. If after
each of the $r$ rounds the component-wise correspondence $(x_1,\ldots,x_k)$ to $(y_1,\ldots,y_k)$ is
a partial isomorphism from $G$ to $H$, this is a win for Duplicator. Otherwise the
winner is Spoiler. The following example should provide the reader with a hint for
the solution of the exercise suggested in Example 1.1.1.

Example 3.1. Spoiler wins $\mathrm{Ehr}_4^3(P_n,H)$ if $\Delta(H) \ge 3$. Assume that $H$ contains no
triangle because otherwise Spoiler wins by pebbling its vertices. Let $v$ be a vertex in
$H$ of degree at least 3. Spoiler pebbles 3 neighbors of $v$. Duplicator should pebble
3 distinct pairwise non-adjacent vertices in $P_n$ for otherwise she loses the game.
The distance between any two vertices pebbled in $H$ is equal to 2. In contrast,
some two vertices pebbled in $P_n$ (say, by pebbles $p_1$ and $p_2$) are at a larger distance.
Spoiler moves $p_3$ to $v$. Duplicator is forced to violate the adjacency relation.

The particular case of $\mathrm{Ehr}_r^k(G,H)$ in which the number of pebbles is the same
as the number of rounds, i.e., $k = r$, deserves special attention. In this case, the
outcome of the game will not be affected if we prohibit moving pebbles from one
vertex to another, that is, if we allow the players to play with each $p_i$ exactly once,
say, in the $i$-th round. We denote this variant of $\mathrm{Ehr}_r^r(G,H)$ by $\mathrm{Ehr}_r(G,H)$ and
will mean it whenever the term Ehrenfeucht game is used with no specification.
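
For very small graphs the outcome of the $r$-round game without pebble reuse can be decided by exhaustive search. The following Python sketch does exactly that; the graph representation (dicts of neighbour sets) and the function names are assumptions of this illustration, and the running time is exponential in the number of rounds.

```python
def spoiler_wins(g_adj, h_adj, rounds, xs=(), ys=()):
    """Decide whether Spoiler wins the game with `rounds` rounds left.

    xs and ys are the vertices pebbled so far in G and H, in order of play.
    """
    def partial_isomorphism():
        # Equality and adjacency must be preserved between pebbled vertices.
        return all(
            (xs[i] == xs[j]) == (ys[i] == ys[j])
            and (xs[j] in g_adj[xs[i]]) == (ys[j] in h_adj[ys[i]])
            for i in range(len(xs)) for j in range(i + 1, len(xs))
        )

    if not partial_isomorphism():
        return True
    if rounds == 0:
        return False
    # Spoiler moves in G and every reply of Duplicator in H loses ...
    if any(all(spoiler_wins(g_adj, h_adj, rounds - 1, xs + (x,), ys + (y,))
               for y in h_adj) for x in g_adj):
        return True
    # ... or Spoiler moves in H and every reply of Duplicator in G loses.
    return any(all(spoiler_wins(g_adj, h_adj, rounds - 1, xs + (x,), ys + (y,))
                   for x in g_adj) for y in h_adj)

def minimum_rounds(g_adj, h_adj, max_rounds):
    """Least r <= max_rounds for which Spoiler wins, or None if there is none."""
    for r in range(1, max_rounds + 1):
        if spoiler_wins(g_adj, h_adj, r):
            return r
    return None

k2 = {0: {1}, 1: {0}}
k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

On the complete graphs with 2 and 3 vertices the search reports that Spoiler needs 3 rounds: he pebbles three distinct vertices in the larger graph and Duplicator runs out of vertices in the smaller one.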

Lemma 3.2. Suppose that in the 3-pebble Ehrenfeucht game on $(G,H)$ some two
vertices $x, y \in V(G)$ at distance $n$ were selected so that their counterparts $x', y' \in
V(H)$ are at a strictly larger distance (possibly infinity). Then Spoiler can win in at
most $\lceil \log n \rceil$ extra moves.

Proof. Spoiler sets $u_1 = x$, $u_2 = y$, $v_1 = x'$, $v_2 = y'$, and places a pebble on the
middle vertex $u$ in a shortest path from $u_1$ to $u_2$ (or either of the two middle vertices
if $d(u_1,u_2)$ is odd). Let $v \in V(H)$ be selected by Duplicator in response to $u$. By
the triangle inequality, we have $d(u,u_m) < d(v,v_m)$ for $m = 1$ or $m = 2$. For such
$m$ Spoiler resets $u_1 = u$, $u_2 = u_m$, $v_1 = v$, $v_2 = v_m$ and applies the same strategy
once again. In this way Spoiler ensures that $d(u_1,u_2) < d(v_1,v_2)$ in each round.
Eventually, unless Duplicator loses earlier, $d(u_1,u_2) = 1$ while $d(v_1,v_2) > 1$, that is,
Duplicator fails to preserve adjacency.
To estimate the number of moves made, notice that initially $d(u_1,u_2) = n$ and
for each subsequent $u_1,u_2$ this distance becomes at most $f(d(u_1,u_2))$, where $f(a) =
(a + 1)/2$. Therefore the number of moves does not exceed the minimum $i$ such
that $f^{(i)}(n) < 2$. As $f^{(i)}(n) = 2^{-i} n - 2^{-i} + 1$, the latter inequality is equivalent to
$2^i \ge n$, which proves the bound. □
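
The move count at the end of the proof can be checked numerically. The sketch below iterates the halving map from the proof using exact rational arithmetic and counts the iterations until the value drops below 2; the function name is ours.

```python
from fractions import Fraction

def halving_moves(n):
    """Smallest i with f^(i)(n) < 2, where f(a) = (a + 1) / 2."""
    a, i = Fraction(n), 0
    while a >= 2:
        a = (a + 1) / 2
        i += 1
    return i
```

For every positive integer $n$ one can try, the result coincides with the least $i$ such that $2^i \ge n$, that is, with $\lceil \log n \rceil$; for example, starting from 5 the successive values are 3, 2, and 3/2, giving three moves.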

There is a rather clear connection between Spoiler's strategy designed in the proof
of Lemma 3.2 and the first-order formula $\Delta'_n(x,y)$ from Section 1.1. We will see that, in some strong
sense, $\mathrm{Ehr}_r(G,H)$ corresponds to first-order logic, while $\mathrm{Ehr}_r^k(G,H)$ corresponds
to its $k$-variable fragment. In fact, every logic has its own corresponding game.

In the $k$-alternation variant of $\mathrm{Ehr}_r(G,H)$ Spoiler is allowed to switch from one
graph to another at most $k$ times during the game, i.e., in at most $k$ rounds he can
choose the graph other than that in the preceding round.

In the counting version of the game $\mathrm{Ehr}_r^k(G,H)$ Spoiler can make a counting
move consisting of two acts. First, he specifies a set of vertices $A$ in one of the
graphs. Duplicator has to respond with a set of vertices $B$ in the other graph so
that $|B| = |A|$ (if this is impossible, she immediately loses). Second, Spoiler places
a pebble $p_i$ on a vertex $b \in B$. In response Duplicator has to place the other copy of
$p_i$ on a vertex $a \in A$. It is clear that any round with $|A| = 1$ is virtually the same
as a round of the standard game.

There is a general analogy between strategies allowing Spoiler to win a game 
on G and H and first-order sentences distinguishing these graphs: the former can 
be converted into the latter and vice versa so that the duration of a game will be 
in correspondence to the quantifier depth and the number of pebbles will be in 
correspondence to the number of variables. 

Theorem 3.3 (The Ehrenfeucht theorem and its variations). Let G and H be non- 
isomorphic graphs. 

1. (Ehrenfeucht [22], Fraïssé [Ug) $D(G, H)$ equals the minimum $r$ such that
Spoiler has a winning strategy in $\mathrm{Ehr}_r(G, H)$.

2. (Pezzoli [60]) $D_k(G, H)$ equals the minimum $r$ such that Spoiler has a winning
strategy in the $k$-alternation game $\mathrm{Ehr}_r(G, H)$.

3. (Immerman [H], Poizat [66]) $W(G, H)$ equals the minimum $k$ such that
Spoiler has a winning strategy in $\mathrm{Ehr}^k_r(G, H)$ for some $r$.

4. (Immerman [H], Poizat [66]) $D^k(G, H)$ equals the minimum $r$ such that
Spoiler has a winning strategy in $\mathrm{Ehr}^k_r(G, H)$.

5. (Immerman and Lander [17]) $W_\#(G, H)$ equals the minimum $k$ such that
Spoiler has a winning strategy in the counting version of $\mathrm{Ehr}^k_r(G, H)$ for
some $r$. Furthermore, if $k \ge W_\#(G, H)$, then $D^k_\#(G, H)$ equals the minimum $r$
such that Spoiler has a winning strategy in the counting version of
$\mathrm{Ehr}^k_r(G, H)$.

We refer the reader to [45, Theorem 6.10] for the proof of Parts 3-5. Part 1 follows
from Part 4 in view of the facts that $D(G, H) = \min_k D^k(G, H)$ and that any
sentence $\Phi$ can be equivalently rewritten with the same quantifier depth $D(\Phi)$ and
with use of at most $D(\Phi)$ variables.

Footnote: It was Ehrenfeucht who formally introduced the game. Prior to Ehrenfeucht, Fraïssé obtained
virtually the same result using an equivalent language of partial isomorphisms.

In view of Lemma 2.5, the Ehrenfeucht theorem provides us with a powerful tool
for estimating the logical depth and width of graphs. Consider, for instance, a path
$P_n$. Example 3.1 and Lemma 3.2 immediately translate into the upper bound
$D^3(P_n) < \log n + 3$. On the other hand, a lower bound $D(P_n) > \log n - 2$ follows
from the existence of a winning strategy for Duplicator in $\mathrm{Ehr}_r(P_n, P_{n+1})$ whenever
$r < \lfloor \log n \rfloor - 1$ (all details can be found in [72, Theorem 2.1.3]).

4. The Weisfeiler-Lehman algorithm 

Graph Isomorphism is the problem of recognizing if two given graphs are isomorphic.
The best known algorithm (Babai, Luks, and Zemlyachenko [9]) takes time
$2^{O(\sqrt{n \log n})}$, where $n$ denotes the number of vertices in the input graphs.
Particular classes of graphs for which Graph Isomorphism is solvable more efficiently are
therefore of considerable interest. Somewhat surprisingly, a number of important 
tractable cases are solvable by a combinatorially simple, uniform approach, namely 
the multidimensional Weisfeiler-Lehman algorithm. The efficiency of this method 
depends much on the logical complexity of input graphs. 

For the history of this approach to the graph isomorphism problem we refer the
reader to [US]. We will abbreviate $k$-dimensional Weisfeiler-Lehman algorithm
by $k$-dim WL. The 1-dim WL is commonly known as the canonical labeling or color
refinement algorithm. It proceeds in rounds; in each round a coloring of the vertices
of the input graphs $G$ and $H$ is defined, which refines the coloring of the previous round.
The initial coloring $C^0$ is uniform, say, $C^0(u) = 1$ for all vertices $u \in V(G) \cup V(H)$.
In the $(i+1)$st round, the color $C^{i+1}(u)$ is defined to be a pair consisting of the
preceding color $C^i(u)$ and the multiset of colors $C^i(w)$ for all $w$ adjacent to $u$.
For example, $C^1(u) = C^1(v)$ iff $u$ and $v$ have the same degree. To keep the color
encoding short, after each round the colors are renamed (we never need more than
$2n$ color names). As the coloring is refined in each round, it stabilizes after at
most $2n$ rounds, that is, no further refinement occurs. The algorithm stops once
this happens. If the multiset of colors of the vertices of $G$ is distinct from the
multiset of colors of the vertices of $H$, the algorithm reports that the graphs are
not isomorphic; otherwise, it declares them to be isomorphic. Disappointingly, the
output is not always correct. The algorithm may report false positives, for example,
if both input graphs are regular with the same vertex degree.
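
The color refinement procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: graphs are given as adjacency dictionaries, the helper names `color_refinement` and `cycle` are hypothetical, and the two graphs are processed as a disjoint union so that color names are shared.

```python
from collections import Counter


def color_refinement(adj_g, adj_h):
    """Run 1-dim WL on two graphs given as adjacency dicts {vertex: [neighbors]}.

    Returns True if the algorithm reports "isomorphic" and False otherwise.
    """
    # Work on the disjoint union so that color names are shared by both graphs.
    adj = {("G", u): [("G", w) for w in ws] for u, ws in adj_g.items()}
    adj.update({("H", u): [("H", w) for w in ws] for u, ws in adj_h.items()})

    color = {u: 0 for u in adj}  # uniform initial coloring
    while True:
        # New color: the old color paired with the multiset of neighbor colors.
        sig = {u: (color[u], tuple(sorted(color[w] for w in adj[u]))) for u in adj}
        # Rename colors by small integers to keep the encoding short.
        names = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {u: names[sig[u]] for u in adj}
        # Refinement only splits classes, so an equal class count means stability.
        stable = len(set(refined.values())) == len(set(color.values()))
        color = refined
        if stable:
            break

    in_g = Counter(c for (side, _), c in color.items() if side == "G")
    in_h = Counter(c for (side, _), c in color.items() if side == "H")
    return in_g == in_h


def cycle(n, offset=0):
    """Adjacency dict of a cycle on the vertices offset, ..., offset + n - 1."""
    return {offset + i: [offset + (i - 1) % n, offset + (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    hexagon = cycle(6)
    two_triangles = {**cycle(3), **cycle(3, offset=3)}
    path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
    print(color_refinement(hexagon, two_triangles))  # both 2-regular: false positive
    print(color_refinement(path4, star4))            # different degree sequences
```

The hexagon and the pair of triangles are both 2-regular on six vertices, so the sketch reports them as isomorphic, which is exactly the kind of false positive mentioned above.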

Following the same idea, the $k$-dimensional version iteratively refines a coloring
of $V(G)^k \cup V(H)^k$. The initial coloring of a $k$-tuple $\bar u$ is the isomorphism type of
the subgraph induced by the vertices in $\bar u$ (viewed as a labeled graph where each
vertex is labeled by the positions in the tuple where it occurs). Loosely speaking, the
refinement step takes into account the colors of all neighbors of $\bar u$ in the Hamming
metric. Color stabilization is surely reached in $r < 2n^k$ rounds and, thus, the
algorithm terminates in polynomial time for fixed $k$.



Footnote: We do not even need more than $n$ color names, because the appearance of the
$(n+1)$th color indicates non-isomorphism.

Let us give a careful description of the $k$-dim WL for $k \ge 2$. Given an ordered
$k$-tuple of vertices $\bar u = (u_1, \ldots, u_k) \in V(G)^k$, we define the isomorphism type of $\bar u$
to be the pair

$\mathrm{tp}(\bar u) = \bigl( \{(i,j) \in [k]^2 : u_i = u_j\},\ \{(i,j) \in [k]^2 : \{u_i, u_j\} \in E(G)\} \bigr)$, (8)

where $[k]$ denotes the set $\{1, \ldots, k\}$. If $w \in V(G)$ and $i \le k$, we let $\bar u^{i,w}$ denote the
result of substituting $w$ in place of $u_i$ in $\bar u$.

The $r$-round $k$-dim WL takes as an input two graphs $G$ and $H$ and purports to
decide if $G \cong H$. The algorithm performs the following operations with the set
$V(G)^k \cup V(H)^k$.

Initial coloring. The algorithm assigns each $\bar u \in V(G)^k \cup V(H)^k$ color
$C^{k,0}(\bar u) = \mathrm{tp}(\bar u)$ (in a suitable encoding).

Color refinement step. In the $i$-th round each $\bar u \in V(G)^k$ is assigned color

$C^{k,i}(\bar u) = \bigl( C^{k,i-1}(\bar u),\ \{\!\{ (C^{k,i-1}(\bar u^{1,w}), \ldots, C^{k,i-1}(\bar u^{k,w})) : w \in V(G) \}\!\} \bigr)$

and similarly with each $\bar u \in V(H)^k$.

Here $\{\!\{ \ldots \}\!\}$ denotes a multiset. In a weaker count-free version of the algorithm,
this notation will be interpreted as a set. Let

$C^{k,r}(G) = \{\!\{ C^{k,r}(\bar u) : \bar u \in V(G)^k \}\!\}$.

Computing an output. The algorithm reports that $G \not\cong H$ if

$C^{k,r}(G) \ne C^{k,r}(H)$ (9)

and that $G \cong H$ otherwise.

In the above description we skipped an important implementation detail. In order
to prevent the length of $C^{k,i}(\bar u)$ from growing at an exponential rate, we arrange the colors
of all $k$-tuples of $V(G)^k \cup V(H)^k$ in the lexicographic order and replace each color
with its number before every refinement step.
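
The description above, including the renaming step, can be turned into a short program. The sketch below is an illustration under stated assumptions rather than an optimized implementation: a graph is a pair (list of vertices, set of `frozenset` edges), the names `tp`, `rename`, `wl`, and `cycle` are hypothetical helpers, and only $k \ge 2$ is supported, as in the text. Its running time is about $n^{k+1}$ per round.

```python
from collections import Counter
from itertools import product


def tp(u, edges):
    """Isomorphism type of the tuple u, cf. (8): equality and adjacency patterns."""
    idx = range(len(u))
    equal = tuple((i, j) for i in idx for j in idx if u[i] == u[j])
    adjacent = tuple((i, j) for i in idx for j in idx
                     if frozenset((u[i], u[j])) in edges)
    return (equal, adjacent)


def rename(colorings):
    """Replace every color by its rank in the sorted list of all colors in use."""
    used = sorted({c for col in colorings for c in col.values()})
    names = {c: i for i, c in enumerate(used)}
    return [{u: names[c] for u, c in col.items()} for col in colorings]


def wl(k, r, G, H, count_free=False):
    """r-round k-dim WL for k >= 2. Returns True iff it reports "isomorphic"."""
    graphs = (G, H)
    cols = rename([{u: tp(u, E) for u in product(V, repeat=k)} for V, E in graphs])
    for _ in range(r):
        new = []
        for (V, _E), col in zip(graphs, cols):
            sig = {}
            for u in col:
                # One row per vertex w: the colors of u with w substituted
                # at position 1, ..., k.
                rows = [tuple(col[u[:i] + (w,) + u[i + 1:]] for i in range(k))
                        for w in V]
                # Multiset in the standard version, set in the count-free one.
                rows = sorted(set(rows)) if count_free else sorted(rows)
                sig[u] = (col[u], tuple(rows))
            new.append(sig)
        cols = rename(new)  # shared renaming keeps colors comparable and short
    return Counter(cols[0].values()) == Counter(cols[1].values())


def cycle(n, offset=0):
    """Cycle on the vertices offset, ..., offset + n - 1."""
    V = list(range(offset, offset + n))
    E = {frozenset((offset + i, offset + (i + 1) % n)) for i in range(n)}
    return V, E


if __name__ == "__main__":
    hexagon = cycle(6)
    (v1, e1), (v2, e2) = cycle(3), cycle(3, offset=3)
    two_triangles = (v1 + v2, e1 | e2)
    print(wl(2, 1, hexagon, two_triangles))  # the 2-dim WL tells them apart
```

One round of the 2-dim WL already separates the hexagon from two disjoint triangles: an adjacent pair in a triangle has a common neighbor, which no adjacent pair of the hexagon has.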

Furthermore, let

$\mathrm{diag}\, C^{k,r}(G) = \{\!\{ C^{k,r}(u^k) : u \in V(G) \}\!\}$,
where $u^k$ denotes the $k$-tuple $(u, \ldots, u)$.

Lemma 4.1. In both the standard and the count-free versions of the $k$-dim WL,
inequality

$\mathrm{diag}\, C^{k,r}(G) \ne \mathrm{diag}\, C^{k,r}(H)$ (10)
implies (9), which in its turn implies

$\mathrm{diag}\, C^{k,r+k-1}(G) \ne \mathrm{diag}\, C^{k,r+k-1}(H)$. (11)

Proof. Consider the standard version; the analysis of the count-free case is similar
(and even simpler). By the equality type of a $k$-tuple $\bar u$ we mean the first component
of (8). Note that $k$-tuples with different equality types never have the same color.
Therefore, $C^{k,r}(G)$ and $C^{k,r}(H)$ are different iff they are different on some class of
$k$-tuples with the same equality type. This proves the first implication.

On the other hand, suppose that (9) holds. Let $E$ be an equality type on which
$C^{k,r}(G)$ and $C^{k,r}(H)$ differ. Note that each $\bar u$ in $E$ contributes color $C^{k,r}(\bar u)$ (a certain
number of times) to color $C^{k,r+k-1}(a^k)$. Moreover, the sum of the contributions over
all vertices $a$ is the same for every $\bar u \in E$. It follows that, if a color has different
multiplicities in $C^{k,r}(G)$ and $C^{k,r}(H)$, its "traces" occur a different number of times in
$\mathrm{diag}\, C^{k,r+k-1}(G)$ and $\mathrm{diag}\, C^{k,r+k-1}(H)$, and hence these multisets are distinct. □

As is easily seen, if $\phi$ is an isomorphism from $G$ to $H$, then for all $k$, $i$, and
$\bar u \in V(G)^k$ we have $C^{k,i}(\bar u) = C^{k,i}(\phi(\bar u))$. This shows that for isomorphic input graphs
the output is always correct. If input graphs are non-isomorphic and the dimension
$k$ is not big enough, the algorithm can erroneously report isomorphism. A criterion
for the optimal choice of the dimension is obtained by Cai, Fürer, and Immerman
[15], who discovered a connection between the Weisfeiler-Lehman algorithm and
the logical complexity of graphs via the Ehrenfeucht game (for the color refinement
algorithm this was done by Immerman and Lander [H]). The success of the standard
version of the algorithm depends on distinguishability of the input graphs in the logic
with counting quantifiers, while the count-free version is in the same way related to
the standard first-order logic.

Referring to the $k$-dim WL below, we will always assume $k \ge 1$ for the standard
version of the algorithm and $k \ge 2$ for its count-free version (we can exclude the
case of $k = 1$, whose analysis differs in some details, as the count-free 1-dim WL is
of no interest: note that it is unable to distinguish between two graphs of order $n$
without isolated vertices).

Given numbers $r$, $l$, and $k \le l$, graphs $G$, $H$, and $k$-tuples $\bar u \in V(G)^k$, $\bar v \in V(H)^k$,
we use notation $\mathrm{Ehr}^l_r(G, \bar u, H, \bar v)$ to denote the $r$-round $l$-pebble Ehrenfeucht game
on $G$ and $H$ with initial configuration $(\bar u, \bar v)$, that is, the game starts on the board
with $k$ already pebbled pairs $(u_i, v_i)$. If the initial configuration is not a partial
isomorphism, Duplicator loses $\mathrm{Ehr}^l_r(G, \bar u, H, \bar v)$ whatever $r \ge 0$. The following
lemma is a key element of our analysis.

Lemma 4.2 (Cai, Fürer, and Immerman [15]). Let $\bar u \in V(G)^k$ and $\bar v \in V(H)^k$.

1. Equality

$C^{k,r}(\bar u) = C^{k,r}(\bar v)$ (12)

holds for (the standard version of) the $k$-dim WL iff Duplicator has a winning
strategy in the counting version of $\mathrm{Ehr}^{k+1}_r(G, \bar u, H, \bar v)$.

2. Equality (12) holds for the count-free version of the $k$-dim WL iff Duplicator
has a winning strategy in (the standard version of) $\mathrm{Ehr}^{k+1}_r(G, \bar u, H, \bar v)$.

Proof. We prove only Part 2 (Part 1 is proved in detail in [15, Theorem 5.2]). We
proceed by induction on $r$. The base case $r = 0$ is straightforward by the definitions
of the initial coloring and the game. Assume that the proposition is true for $r - 1$
rounds.

Let $x_i$ and $y_i$ denote the vertices in $G$ and $H$ respectively marked by the $i$-th
pebble pair. Assume (12) and consider the Ehrenfeucht game on $G$, $H$ with initial
configuration $(x_1, \ldots, x_k) = \bar u$ and $(y_1, \ldots, y_k) = \bar v$. First of all, this configuration is
non-losing for Duplicator since (12) implies that $\mathrm{tp}(\bar u) = \mathrm{tp}(\bar v)$. Further, Duplicator
can survive in the first round. Indeed, assume that Spoiler in this round selects a
vertex $a$ in one of the graphs, say in $G$. Then Duplicator selects a vertex $b$ in the other
graph $H$ so that $C^{k,r-1}(\bar u^{i,a}) = C^{k,r-1}(\bar v^{i,b})$ for all $i \le k$. In particular,
$\mathrm{tp}(\bar u^{i,a}) = \mathrm{tp}(\bar v^{i,b})$ for all $i \le k$. Along with $\mathrm{tp}(\bar u) = \mathrm{tp}(\bar v)$, this implies that
$\mathrm{tp}(\bar u, a) = \mathrm{tp}(\bar v, b)$. Assume now that in the second round Spoiler removes the $j$-th
pebble, $j \le k$. Then Duplicator's task in the rest of the game is essentially to win
$\mathrm{Ehr}^{k+1}_{r-1}(G, \bar u^{j,a}, H, \bar v^{j,b})$. Since $C^{k,r-1}(\bar u^{j,a}) = C^{k,r-1}(\bar v^{j,b})$, Duplicator succeeds
by the induction assumption.

Assume now that (12) is false. It follows that $C^{k,r-1}(\bar u) \ne C^{k,r-1}(\bar v)$ (then Spoiler
has a winning strategy by the induction assumption) or there is a vertex $a$ in
one of the graphs, say in $G$, such that for every $b$ in the other graph $H$ we have
$C^{k,r-1}(\bar u^{j,a}) \ne C^{k,r-1}(\bar v^{j,b})$ for some $j = j(b)$. In the latter case Spoiler in his first
move places the $(k+1)$-th pebble on $a$. Let $b$ be the vertex selected in response
by Duplicator. In the second move Spoiler will remove the $j(b)$-th pebble, which
implies that the players essentially play $\mathrm{Ehr}^{k+1}_{r-1}(G, \bar u^{j,a}, H, \bar v^{j,b})$ from now on. By
the induction assumption, Spoiler wins. □

Lemma 4.3. Equality $\mathrm{diag}\, C^{k,r}(G) = \mathrm{diag}\, C^{k,r}(H)$ is true for the standard (resp.
count-free) version of the $k$-dim WL iff Duplicator has a winning strategy in the
counting (resp. standard) version of $\mathrm{Ehr}^{k+1}_{r+1}(G, H)$.

Proof. We consider the standard version of the algorithm; the proof for the
count-free version is very similar. If the multisets $\mathrm{diag}\, C^{k,r}(G)$ and $\mathrm{diag}\, C^{k,r}(H)$ are not
equal, Spoiler has a winning strategy in the counting game $\mathrm{Ehr}^{k+1}_{r+1}(G, H)$. In the
first round he makes a counting move that forces pebbling $a \in V(G)$ and $b \in V(H)$
so that $C^{k,r}(a^k) \ne C^{k,r}(b^k)$. The remainder of the game is equivalent to the counting
game $\mathrm{Ehr}^{k+1}_r(G, a^k, H, b^k)$, where Spoiler has a winning strategy by Lemma 4.2.

If the multisets $\mathrm{diag}\, C^{k,r}(G)$ and $\mathrm{diag}\, C^{k,r}(H)$ are equal, Duplicator is able to play
the first round so that $C^{k,r}(a^k) = C^{k,r}(b^k)$ for the pebbled vertices $a$ and $b$. She
wins the remaining game again by Lemma 4.2. □

We say that the $r$-round $k$-dim WL works correctly for a graph $G$ if its output
is correct on all input pairs $(G, H)$ (here $H$ may have any order, not necessarily the
same as $G$).

Theorem 4.4. The $r$-round $k$-dim WL works correctly for $G$ if

$k \ge W_\#(G) - 1$ and $r \ge D^{k+1}_\#(G) - 1$

and only if

$k \ge W_\#(G) - 1$ and $r \ge D^{k+1}_\#(G) - k$.

The same holds true for the count-free $r$-round $k$-dim WL and the standard logic
(without counting).

Proof. If $G \cong H$, the output is correct in any case. Suppose that $G \not\cong H$. By Lemma
4.1, inequality (10) is a sufficient condition for the output being correct while (11)
is a necessary condition for this. The theorem now follows from Lemma 4.3, the
Ehrenfeucht theorem (Theorem 3.3, Parts 4 and 5), and Lemma 2.5.1 along with its counting
version. □

By Theorem 4.4, $k \ge W_\#(G) - 1$ is both a sufficient and a necessary condition for
the $k$-dim WL to work successfully on all inputs $(G, H)$. As we already discussed,
the number of rounds can be taken $r = O(n^k)$. Therefore, Graph Isomorphism is
solvable in polynomial time for any class of graphs $C$ with $W_\#(G) = O(1)$ for all
$G \in C$. This applies to any class of graphs embeddable into a fixed surface and any
class of graphs with bounded treewidth (see Section 5.1).

Sometimes the Weisfeiler-Lehman algorithm gives us an even better result, namely
the solvability of the isomorphism problem by a parallel algorithm in polylogarithmic
time. The concept of polylogarithmic parallel time is captured by the complexity
class NC and its refinements:

$\mathrm{NC} = \bigcup_i \mathrm{NC}^i$ and $\mathrm{NC}^i \subseteq \mathrm{AC}^i \subseteq \mathrm{TC}^i \subseteq \mathrm{NC}^{i+1}$,

where $\mathrm{NC}^i$ consists of functions computable by circuits of polynomial size and depth
$O(\log^i n)$, $\mathrm{AC}^i$ is an analog for circuits with unbounded fan-in, and $\mathrm{TC}^i$ is an
extension of $\mathrm{AC}^i$ allowing threshold gates. As is well known [19], $\mathrm{AC}^i$ consists of
exactly those functions computable by a CRCW PRAM with polynomially many
processors in time $O(\log^i n)$. Grohe and Verbitsky [32] point out that the $r$-round
$k$-dim WL (resp. its count-free version) is implementable in $\mathrm{TC}^1$ (resp. $\mathrm{AC}^1$) as long
as $k = O(1)$ and $r = O(\log n)$. If combined with Theorem 4.4, this gives us the
following result.

Theorem 4.5. Let $k \ge 2$ be a constant.

1. Let $C$ be a class of graphs $G$ with $D^k_\#(G) = O(\log n)$. Then Graph Isomorphism
for $C$ is solvable in $\mathrm{TC}^1$.

2. Let $C$ be a class of graphs $G$ with $D^k(G) = O(\log n)$. Then Graph Isomorphism
for $C$ is solvable in $\mathrm{AC}^1$.

We will see applications of Theorem 4.5 in Section 5.1.

Suppose that $k \ge W(G) - 1$ and that we do not know a priori any bounds for
$D^{k+1}(G)$. How large does $r$ have to be taken in order to ensure that the $r$-round $k$-dim
WL works correctly for $G$? An answer is given by the important concept of color
stabilization that was already discussed in the beginning of this section. We will
regard $C^{k,r}$ as a partition of $V(G)^k \cup V(H)^k$. Let $R$ be the minimum number for
which $C^{k,R} = C^{k,R-1}$. Of course, it is enough to check the condition (9) for $r = R$;
it cannot change for bigger $r$. Since each $C^{k,r}$ is a refinement of $C^{k,r-1}$, we have
$R \le v(G)^k + v(H)^k$. In fact, we are able to prove a slightly more delicate claim: the
Weisfeiler-Lehman algorithm can be terminated as soon as $C^{k,r}$ stabilizes at least
within $V(G)^k$.

To make this more precise, we introduce some notation. Denote the restriction of
the partition $C^{k,r}$ to $V(G)^k$ by $C^{k,r}_G$. Let $\mathrm{Stab}^k(G)$ be the smallest number $s$ such
that $C^{k,s}_G = C^{k,s-1}_G$. Note that $\mathrm{Stab}^k(G)$ is an individual combinatorial parameter of
a graph $G$, not depending on $H$ (we may think that the $k$-dim WL is run on a single
graph $G$, which is actually a quite meaningful canonization mode of the algorithm).

We now state practical termination rules for the /c-dim WL. 

Rule 1: Once $C^{k,r}(G) \ne C^{k,r}(H)$, terminate and report non-isomorphism.
Rule 2: Once $r = \mathrm{Stab}^k(G)$ and $C^{k,r}(G) = C^{k,r}(H)$, terminate and report
isomorphism.

Let us argue that these rules are sound for both versions of the algorithm. Suppose
that Rule 2 is invoked. Thus $C^{k,r}_G = C^{k,r-1}_G$ and $C^{k,r}(G) = C^{k,r}(H)$. By the latter
equality we also have $C^{k,r-1}(G) = C^{k,r-1}(H)$. It follows that in the $r$-th round
the algorithm achieves a proper color refinement on neither $G$ nor $H$. Thus, the
partition $C^{k,r}$ has stabilized on $V(G)^k \cup V(H)^k$ and the soundness of Rule 2
follows.
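
Rule 2 relies on $\mathrm{Stab}^k(G)$, which depends on $G$ alone and can be computed by running the refinement on a single graph. The sketch below illustrates this for the standard (counting) version and $k \ge 2$. It is an assumption-laden illustration, not the authors' procedure: the function name `stab` is hypothetical, and a graph is given as a vertex list `V` and a set `E` of `frozenset` edges.

```python
from itertools import product


def stab(k, V, E):
    """Stab^k(G): the first round s >= 1 in which k-dim WL no longer refines V^k."""

    def tp(u):
        # Isomorphism type of a k-tuple: equality pattern and adjacency pattern.
        idx = range(k)
        return (tuple((i, j) for i in idx for j in idx if u[i] == u[j]),
                tuple((i, j) for i in idx for j in idx
                      if frozenset((u[i], u[j])) in E))

    def rename(col):
        # Replace each color by its rank among the colors currently in use.
        names = {c: i for i, c in enumerate(sorted(set(col.values())))}
        return {u: names[c] for u, c in col.items()}

    col = rename({u: tp(u) for u in product(V, repeat=k)})
    s = 0
    while True:
        s += 1
        sig = {}
        for u in col:
            rows = sorted(tuple(col[u[:i] + (w,) + u[i + 1:]] for i in range(k))
                          for w in V)
            sig[u] = (col[u], tuple(rows))
        new = rename(sig)
        # A refinement can only split classes, so an unchanged number of
        # classes means that the partition is stable.
        if len(set(new.values())) == len(set(col.values())):
            return s
        col = new


if __name__ == "__main__":
    triangle = {frozenset(e) for e in [(0, 1), (1, 2), (0, 2)]}
    print(stab(2, [0, 1, 2], triangle))
```

For the triangle the initial partition into diagonal and adjacent pairs is already stable, so the first round brings no refinement.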

Theorem 4.6. 

1. The $r$-round $k$-dim WL recognizes non-isomorphism of $G$ and $H$ if

$k \ge W_\#(G, H) - 1$ and $r \ge \mathrm{Stab}^k(G)$.

2. The $r$-round $k$-dim WL works correctly for $G$ if

$k \ge W_\#(G) - 1$ and $r \ge \mathrm{Stab}^k(G)$.

3. Both claims hold true for the count-free version of the algorithm and the
standard logic (with no counting).

We have seen that good bounds for the logical complexity of graphs imply effi- 
ciency of the Weisfeiler-Lehman algorithm on these graphs. Now we will get a couple 
of noteworthy facts on the logical complexity as a consequence of our analysis of the 
algorithm. 

Theorem 4.7. Let G be a graph of order n. 

1. If $G$ is distinguishable from another graph $H$ in the $\ell$-variable logic, then
$D^{\ell}(G, H) \le n^{\ell-1} + \ell - 2$.

2. If $G$ is definable in the $\ell$-variable logic, then $D^{\ell}(G) \le n^{\ell-1} + \ell - 2$.

Proof. Let $k = \ell - 1$. Comparing the sufficient conditions for the correctness of
the $r$-round $k$-dim WL given by Theorem 4.6 and the necessary conditions given by
Theorem 4.4, we have $D^{k+1}(G, H) \le \mathrm{Stab}^k(G) + k$ provided $k \ge W(G, H) - 1$ and
$D^{k+1}(G) \le \mathrm{Stab}^k(G) + k$ provided $k \ge W(G) - 1$. For the former claim we need also
the fact, actually established in the proof of Theorem 4.4, that the count-free $r$-round
$k$-dim WL is able to recognize non-isomorphism of $G$ and $H$ only if $k \ge W(G, H) - 1$
and $r \ge D^{k+1}(G, H) - k$. It remains to notice that $\mathrm{Stab}^k(G) \le n^k - 1$. □
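
For convenience, the chain of inequalities behind Part 2 can be written out in one line. This only restates the steps of the proof with $k = \ell - 1$, reading the inequalities in their non-strict form (an assumption on our side); Part 1 is the same with $(G, H)$ in place of $G$.

```latex
\[
  D^{\ell}(G) \;=\; D^{k+1}(G) \;\le\; \mathrm{Stab}^{k}(G) + k
  \;\le\; (n^{k} - 1) + k \;=\; n^{\ell-1} + \ell - 2 .
\]
```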

A somewhat weaker bound $D^{\ell}(G) \le n^{\ell} + \ell + 1$ follows from the work of Dawar,
Lindell, and Weinstein [20, Corollary 4].

5. Worst case bounds 

5.1. Classes of graphs. Here we overview known bounds for the logical depth and
width for natural classes of graphs. Several interesting definability effects can be
observed even when we focus on graphs as simple as trees. This class is considered at
the beginning of this section (and will be further discussed in Sections 6 and 7). We
will see that many results about trees admit generalization to graphs with bounded
treewidth. We further consider planar graphs. Then we briefly discuss the more general
cases of graphs embeddable into a fixed surface and graphs with an excluded minor,
as well as a few sporadic results on other classes.

5.1.1. Trees. The following result is based on Edmonds' algorithm, which dates back
to the sixties (see, e.g., [IS]); its logical interpretation is due to Immerman and
Lander [27].

Theorem 5.1. 

1. The color refinement algorithm succeeds in recognizing isomorphism of trees.
Consequently, $W_\#(T, T') \le 2$ for every two non-isomorphic trees $T$ and $T'$.

2. $W_\#(T) \le 2$ for every tree $T$.

Proof. 1. As in Section 4, let $C^r$ denote the coloring appearing after the $r$-th
refinement. Let $N_r(v)$ denote the set of all vertices at distance at most $r$ from a
vertex $v$. It is not hard to see that, if $v$ is an arbitrary vertex in a tree $T$, then the
subtree spanned by $N_r(v)$ is, up to isomorphism, reconstructible from $C^r(v)$. Let $v$
and $v'$ be arbitrary vertices in trees $T$ and $T'$. If $T \not\cong T'$, we have $C^r(v) \ne C^r(v')$ at
the latest for $r$ one greater than the smaller of the eccentricities of $v$ and $v'$. Therefore,
the color refinement algorithm distinguishes between any two non-isomorphic trees.
The second statement of Part 1 follows by Lemma 4.2.1 and Theorem 3.3.5.

2. To obtain the desired definability result, we use the equality $W_\#(T) =
\max_{H \not\cong T} W_\#(T, H)$, which is an analog of Lemma 2.5.2 (with a much simpler proof
as graphs of different orders are distinguishable with a single counting quantification).
Thus, it suffices to prove that $W_\#(T, H) \le 2$ whenever $H \not\cong T$. Suppose that
$H$ is not a tree, for otherwise we are done by Part 1. Also, as was just mentioned,
we can suppose that both $T$ and $H$ have $n$ vertices.

Assume first that $H$ has a connected component $T'$ which is a tree. Note that
$T' \not\cong T$ because $T'$ has fewer than $n$ vertices. Let $v \in V(T)$ and $v' \in V(T')$. Run
the color refinement algorithm on input $(T, H)$. As in the proof of Part 1 we have
$C^n(v) \ne C^n(v')$ because the coloring $C^n$ on $T'$ is the same as if the algorithm was
run on $T'$ instead of $H$. Therefore, $T$ and $H$ are distinguishable with 2 variables in
the counting logic.

If none of the connected components of $H$ is a tree, then $H$ has at least $n$ edges.
Since $T$ has exactly $n - 1$ edges, $H$ and $T$ have distinct multisets of vertex degrees
and, hence, are distinguishable by a sentence with 2 counting quantifiers. □

The proof of Theorem 5.1 gives us only a linear upper bound $D^2_\#(T) = O(n)$ for
a tree of order $n$. We can get a speed-up if we allow more variables.

Theorem 5.2. For every tree $T$ on $n$ vertices we have

$D^3_\#(T) \le 3 \log n$.

Proof. By an analog of Lemma 2.5.1 for the counting logic and Theorem 3.3.5, we
have to show that Spoiler is able to win the counting game $\mathrm{Ehr}^3_r(T, T')$ with some
$r \le 3 \log n$ for any graph $T'$ non-isomorphic to $T$. Suppose that $T'$ has the same
order $n$. If $T'$ is disconnected, Spoiler wins (even without counting moves) by Lemma
3.2. If $T'$ is connected and has a cycle, then $T$ and $T'$ have distinct multisets of
vertex degrees. Therefore, we will suppose that $T'$ is a tree too.

Every tree $T$ has a single-vertex separator, that is, a vertex $v$ such that no branch
of $T - v$ has more than $n/2$ vertices; see, e.g., Ore [59, Chapter 4.2]. The idea of
Spoiler's strategy is to pebble such a vertex and to force further play on some
non-isomorphic branches of $T$ and $T'$, where the same strategy can be applied recursively.

Figure 1. A separator strategy of Spoiler.
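
A single-vertex separator of a tree is easy to find by computing subtree sizes. The sketch below is a straightforward illustration and is not taken from the cited source: the name `tree_separator` is hypothetical, and the tree is assumed to be given as an adjacency dictionary.

```python
def tree_separator(adj):
    """Return a vertex v of the tree such that no branch of T - v has more than n/2 vertices."""
    n = len(adj)
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for u in order:                      # breadth-first traversal from the root
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
    size = {u: 1 for u in adj}           # sizes of the subtrees hanging below u
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]
    for v in adj:
        # Branches of T - v: the subtrees of the children of v and,
        # unless v is the root, the part of the tree above v.
        branches = [size[w] for w in adj[v] if parent[w] == v]
        if parent[v] is not None:
            branches.append(n - size[v])
        if all(2 * b <= n for b in branches):
            return v


if __name__ == "__main__":
    path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(tree_separator(path5))
```

On the path with five vertices the middle vertex is returned; both branches then have two vertices each.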

Thus, in the first round Spoiler pebbles a separator $v$ in $T$ and Duplicator responds
with a vertex $v'$ somewhere in $T'$. The component of $T - v$ containing a neighbor
$u$ of $v$ will be denoted by $T_{vu}$ and considered a rooted tree with the root at $u$. A
similar notation will apply also to $T'$. In the second round Spoiler makes a counting
move and ensures that $u \in N(v)$ and $u' \in N(v')$ are pebbled so that the rooted
trees $T_{vu}$ and $T'_{v'u'}$ are non-isomorphic, see Fig. 1. The next goal of Spoiler is to
force pebbling adjacent vertices $v_1$ and $u_1$ in $T_{vu}$ and adjacent vertices $v'_1$ and $u'_1$ in
$T'_{v'u'}$ so that $T_{v_1u_1} \not\cong T'_{v'_1u'_1}$, $V(T_{v_1u_1}) \subset V(T_{vu})$, and $v(T_{v_1u_1}) \le v(T_{vu})/2$. Once this
is done, the same will be repeated recursively.

To make the transition from $T_{vu}$ to $T_{v_1u_1}$, Spoiler follows three rules.

Rule 1. If $T_{vu}$ has a branch $T_{ux}$ for some $x \in N(u) \setminus \{v\}$ such that $v(T_{ux}) \le
v(T_{vu})/2$ and the number of branches isomorphic to $T_{ux}$ is different for $T_{vu}$ and $T'_{v'u'}$,
then Spoiler makes a counting move and forces pebbling such $x$ and $x' \in N(u') \setminus \{v'\}$
so that $T_{ux} \not\cong T'_{u'x'}$. The latter two branches will serve as $T_{v_1u_1}$ and $T'_{v'_1u'_1}$. If no such
branch is available, Spoiler pebbles a separator $w$ of $T_{vu}$. Note that Duplicator is
forced to respond with a vertex $w'$ in $T'_{v'u'}$. Otherwise we would have $\mathrm{dist}(w,u) =
\mathrm{dist}(w,v) - 1$ while $\mathrm{dist}(w',u') = \mathrm{dist}(w',v') + 1$. Therefore, some distances among
the three pebbled vertices would be different in $T$ and in $T'$ and Spoiler could win
in fewer than $\log v(T_{vu}) + 1$ moves by Lemma 3.2.

Rule 2. If $T$ differs from $T'$ by some branch $T_{wx}$ (having a different number of
occurrences in $T' - w'$) that does not contain $u$, Spoiler makes a counting move with
the pebble released from $v$ and forces pebbling such $x$ in $T$ and some $x'$ in $T'$ so
that $T_{wx} \not\cong T'_{w'x'}$. These branches will serve as $T_{v_1u_1}$ and $T'_{v'_1u'_1}$. (It is possible that
$T'_{w'x'}$ contains $u'$ or $T_{wx}$ contains $u$, but then the distances among $u, w, x$ are not all
equal to the distances among $u', w', x'$ and Spoiler quickly wins.)

Rule 3. Denote the branch of $T_{vu} - w$ containing $u$ by $T_{w,u}$ (and similarly for
$T'$). If Rule 2 is not applicable, then $T_{w,u}$ and $T'_{w',u'}$ are non-isomorphic (where
an isomorphism would need to respect two pairs of designated vertices, namely $u$
and $u'$ as well as the neighbors of $w$ and $w'$). Assume the harder subcase that
$\mathrm{dist}(u, w) = \mathrm{dist}(u', w')$. When Spoiler pebbles a vertex $y$ on the path from $u$ to
$w$ by moving the pebble from $v$, Duplicator is forced to pebble the corresponding
vertex $y'$ on the path from $u'$ to $w'$. It is easy to see that a vertex $y \ne w$ can be
chosen so that $T - y$ and $T' - y'$ differ by branches containing neither $u$ and $u'$ nor
$w$ and $w'$. Let Spoiler pebble such $y$ as close to $w$ as possible. Note that $y \ne u$
because otherwise Rule 1 would have been applicable. Now Spoiler can make a counting
move with the pebble released from $w$ to force pebbling $z$ and $z'$ for which

$T_{yz} \not\cong T'_{y'z'}$, (13)

see Fig. 2. This complies with the goal of finding new $T_{v_1u_1}$ and $T'_{v'_1u'_1}$ because

$v(T_{yz}) < v(T_{w,u}) \le v(T_{vu})/2$. (14)

Figure 2. Rule 3 invoked.

In fact, Duplicator could try to prevent the fulfillment of (13) by forcing a choice
of $z'$ such that $T'_{y'z'}$ would contain either $u'$ or $w'$. In the former case Spoiler could
win by using differences between the distances among $u, y, z$ and among $u', y', z'$.
In the latter case (13) would anyway be true because $T_{yz}$ and $T'_{y'z'}$ would have
different orders. Indeed, since Rule 2 was not applicable, the choice of $y$ ensures
that the branches of $T - y$ and $T' - y'$ containing $w$ and $w'$ are isomorphic. Thus, we
would have $v(T'_{y'z'}) = v(T_{y,w}) > v(T_{vu})/2$ (where $T_{y,w}$ denotes the branch of $T - y$
containing $w$) while $v(T_{yz})$ is strictly smaller by (14).

Given $T_{vu}$, Spoiler finds a new distinguishing branch $T_{v_1u_1}$ in 3 rounds in the worst
case. Also, 2 rounds suffice to win the game once the current subtree $T_{vu}$ has at
most 4 vertices. The number of transitions from the initial branch of order at most
$\lfloor n/2 \rfloor$ to one with at most 4 vertices is bounded by $\log \lfloor n/2 \rfloor - 2$ because $v(T_{vu})$
is at least halved each time. Routine calculations (and Lemma 3.2) imply the
desired bound on the length of the game. □

The definability of trees in a finite-variable counting logic within logarithmic
quantifier depth can also be derived from work by Etessami and Immerman [27], which
also implies that counting quantifiers are not needed here as long as the maximum
vertex degree is bounded by a constant.

Curiously, Theorem 5.2 sheds some new light on the history of isomorphism testing
for trees. The first record of this history was made by Edmonds, who showed
that the problem is solvable in linear time (see Theorem 5.1). Ruzzo [71] found
an $\mathrm{AC}^1$ algorithm under the condition that the vertex degrees of input trees are at
most logarithmic in the number of vertices. Miller and Reif [55] established an $\mathrm{AC}^1$
upper bound unconditionally. They wrote [55, page 1128]: "No polylogarithmic parallel
algorithm was previously known for isomorphism of unbounded-degree trees."
However, the 2-dimensional Weisfeiler-Lehman algorithm has been discussed in the
literature at least since 1968 (e.g., [77]) and, as we now see by combining Theorem
5.2 with Theorem 4.5.1, this algorithm does the job for arbitrary trees in $\mathrm{NC}^2$, i.e.,
in parallel time $O(\log^2 n)$!
To complete this historical overview, we have to mention a result by Lindell [53],
who showed that isomorphism of trees is recognizable in logarithmic space. Though
Lindell's result is best possible (see Jenner et al. [IS]), the solvability of the problem
by such a simple and natural procedure as the Weisfeiler-Lehman algorithm still remains
a noteworthy fact.

Note that $D^2_\#(P_n) = n/2 - O(1)$ (it is not hard to see that the color refinement
algorithm requires at least $n/2 - O(1)$ rounds to distinguish between $P_n$ and the
disjoint union of $P_{n-3}$ and $C_3$). Thus, Theorem 5.2 shows a jump from linear to
logarithmic quantifier depth when the number of variables is increased just by 1.
Such width-depth trade-offs were observed and studied by Fürer [31].
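
The linear number of rounds needed for $P_n$ versus the disjoint union of $P_{n-3}$ and $C_3$ can be observed experimentally. The sketch below is an illustration only: the helper names `rounds_to_distinguish`, `path`, and `path_plus_triangle` are hypothetical, graphs are adjacency dictionaries, and $n \ge 4$ is assumed.

```python
from collections import Counter


def rounds_to_distinguish(adj_g, adj_h):
    """First round of color refinement in which the color multisets of G and H differ.

    Returns None if the coloring stabilizes without separating the graphs.
    """
    adj = {("G", u): [("G", w) for w in ws] for u, ws in adj_g.items()}
    adj.update({("H", u): [("H", w) for w in ws] for u, ws in adj_h.items()})
    color = {u: 0 for u in adj}
    r = 0
    while True:
        r += 1
        sig = {u: (color[u], tuple(sorted(color[w] for w in adj[u]))) for u in adj}
        names = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {u: names[sig[u]] for u in adj}
        in_g = Counter(c for (side, _), c in refined.items() if side == "G")
        in_h = Counter(c for (side, _), c in refined.items() if side == "H")
        if in_g != in_h:
            return r
        if len(set(refined.values())) == len(set(color.values())):
            return None  # stable partition, graphs not separated
        color = refined


def path(n):
    """Adjacency dict of the path on the vertices 0, ..., n - 1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def path_plus_triangle(n):
    """Disjoint union of a path on n - 3 vertices and a triangle (n >= 4)."""
    g = path(n - 3)
    a, b, c = n - 3, n - 2, n - 1
    g.update({a: [b, c], b: [a, c], c: [a, b]})
    return g


if __name__ == "__main__":
    for n in (6, 7, 20, 41):
        print(n, rounds_to_distinguish(path(n), path_plus_triangle(n)))
```

In this sketch the two graphs are separated in round $\lfloor (n-3)/2 \rfloor + 1$, which grows linearly in $n$ and agrees with the $n/2 - O(1)$ estimate above.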

Theorem 5.1 says that 2 variables and counting quantifiers suffice to define any
tree. Moreover, we could well manage without counting quantifiers but then we
would need to have $\Delta(T) + 1$ variables. A simple example of a star, where
$W(K_{1,m}) = m + 1$, shows that a smaller number is not enough. The following bound is a variant of
a result by Immerman and Kozen [36], who consider definability of trees represented
by an asymmetric child-parent relation between vertices.

Theorem 5.3. $W(T) \le \Delta(T) + 1$ for any tree $T$ with the exceptions of $T \in \{P_1, P_2\}$.

The logical depth of a tree can be bounded in terms of the maximum degree and 
the order. 

Theorem 5.4 (Bohman et al. [12]).

1. For every tree $T$ of order $n$ with maximum vertex degree $\Delta(T) > 9$ we have



2. Let $D(n, d)$ be the maximum of $D(T)$ over all trees with $n$ vertices and maximum
degree at most $d = d(n)$. If both $d$ and $\log n / \log d$ tend to infinity,
then



The upper bound on $D(T)$ comes from a Spoiler strategy similar to that of the
proof of Theorem 5.2; that is, Spoiler pebbles a separator $v$ of the given tree $T$
and then tries to restrict the game to one of the components of $T - v$. Informally
speaking, the worst case scenario for Spoiler is when $T - v$ has $d$ components of
order about $n/d$ of two different isomorphism types, each occurring half of the time.
Then Spoiler may need around $d/2$ extra moves to restrict the game to a component
of $T - v$ (if the components of the counterpart $T' - v'$ are of these two types but
with different multiplicities). Thus, roughly, Spoiler "reduces" the order by a factor of
$d$ using $d/2$ moves, which gives the heuristic for the bound of Theorem 5.4.2. The
optimality of this bound is given by a recursive construction of a tree $T$ (and another
tree $T' \not\cong T$), where at each recursion step we glue together about $d/2$ trees of two
different isomorphism types at a common root.

5.1.2. Graphs of bounded treewidth. Informally speaking, the treewidth of a graph
tells us to what extent the graph is representable as a tree-like structure. This
concept appeared in the Robertson-Seymour theory and, aside from its theoretical
importance, has found many applications in the design of algorithms on graphs. We do
not go into any detail here, referring instead to the books [22] and [23], which may serve
as introductions to, respectively, the structural theory of graphs and the algorithmic
applications.

It happens quite often that techniques applicable to trees can be extended to 
graphs whose treewidth is bounded by a constant. In particular, this is true for the 
definability parameters. 

Theorem 5.5. 

1. (Grohe and Mariño [3D]) If a graph G has treewidth k, then $W_\#(G) \le k + 2$.

2. (Grohe and Verbitsky [32]) If a graph G on n vertices has treewidth k, then

$D^{4k+4}_\#(G) < 2(k + 1) \log n + 8k + 9.$

Consequently, isomorphism of graphs whose treewidth does not exceed k is
recognizable by the (4k+3)-dimensional Weisfeiler-Lehman algorithm in $\mathrm{TC}^1$.

The last claim in the theorem follows a general paradigm provided by Theorem
4.5.1: A low quantifier depth implies solvability of the isomorphism problem in NC.
Prior to [32], for graphs of bounded treewidth only the polynomial-time isomorphism
test of Bodlaender [11] was known. Very recently, Das, Torán, and Wagner [18] put
the problem in the complexity class LOGCFL.

Like Theorem 5.2, the proof of Theorem 5.5.2 is based on separator techniques.
In general, a set $X \subseteq V(G)$ will be called a separator for a graph G of order n
if every component of $G \setminus X$ has at most n/2 vertices. It is well known [10] that
all graphs of treewidth k have separators of size k + 1.
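The separator condition is easy to test mechanically. The following is a minimal sketch, assuming a graph is stored as a dictionary mapping each vertex to the set of its neighbors; the function names `components` and `is_separator` are illustrative and do not come from the cited papers.

```python
from collections import deque


def components(adj, removed):
    """Vertex lists of the connected components of the graph after deleting `removed`."""
    seen = set(removed)
    comps = []
    for start in adj:
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps


def is_separator(adj, X):
    """True if every component of G \\ X has at most n/2 vertices, n being the order of G."""
    n = len(adj)
    return all(2 * len(c) <= n for c in components(adj, X))
```

For a path on 7 vertices, the middle vertex alone is a separator in this sense, while an endpoint is not.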

5.1.3. Planar graphs. The separator techniques in the study of logical complexity
of graphs were introduced by Cai, Fürer, and Immerman [15], who derived the bound
$W_\#(G) = O(\sqrt{n})$ for planar graphs from the known fact [53] that every planar graph
of order n has a separator of size $O(\sqrt{n})$. In fact, this result is a particular case
of Theorem 5.5.1 because planar graphs have treewidth bounded by $5\sqrt{5n}$; see [3,
Proposition 4.5]. Later Grohe [35] proved that $W_\#(G)$ for all planar G is actually
bounded by a constant.

Without counting quantifiers we cannot have any nontrivial upper bound for the 
logical depth in terms of the order of a graph as long as a class under considera- 
tion contains all trees. However, some natural classes of planar graphs admit such 
bounds. A plane drawing of a graph is called outerplanar if all the vertices lie on 
the boundary of the outer face. Outerplanar graphs are those planar graphs having 
an outerplanar drawing. The treewidth of any outerplanar graph is at most 2. As
is well known (see, e.g., [43]), any outerplanar graph is representable as a tree of
its biconnected components. Note also that an outerplanar graph is biconnected iff
it has a Hamiltonian cycle and that such a graph can be geometrically viewed as a
dissection of a convex polygon.

Theorem 5.6 (Verbitsky [75, 76]).

1. If G is a biconnected outerplanar graph of order n, then $D(G) < 22 \log n + 9$.

2. For a 3-connected planar graph G of order n we have $D^{15}(G) < 11 \log n + 45$.

Part 2 shows another case when Theorem 4.5 is applicable. It gives an $\mathrm{AC}^1$
isomorphism test for 3-connected planar graphs and, by a known reduction of Miller
and Reif [55], for the whole class of planar graphs. This complexity bound for
planar graph isomorphism is not new; it follows from the $\mathrm{AC}^1$ isomorphism test for
embeddings designed in [55] and the $\mathrm{AC}^1$ embedding algorithm in [68]. As a possible
advantage of the Weisfeiler-Lehman approach, note that it is combinatorially much
simpler and more direct. In particular, we do not need any embedding procedure
here. The best possible complexity bound for planar graph isomorphism was
recently obtained by Datta et al. [19], who designed a logarithmic-space algorithm for
this problem.

Theorem 5.6.1 is proved in [75] and is based on the existence of a 2-vertex separator
in any outerplanar graph. The possibility of avoiding counting quantifiers relies on
a certain rigidity of biconnected outerplanar graphs. The latter is related to the
following geometric fact: Any such graph has a unique, up to homeomorphism,
outerplanar drawing.

The case of 3-connected planar graphs is much more complicated because the
smallest separators in such graphs can have about $\sqrt{n}$ vertices (such examples can
be obtained by adding a few edges to the grid graph $P_m \times P_m$). The proof of
Theorem 5.6.2 in [76] exploits a strong rigidity property of 3-connected planar graphs: By
the Whitney theorem (see, e.g., [56]), they have a unique, up to homeomorphism,
embedding into the sphere. An embedding can be represented as a purely combinatorial
structure, called a rotation system (see [56]), to which one can extend the
concepts of definability, isomorphism, the Ehrenfeucht game, etc. Defining rotation
systems is simpler because they admit a kind of coordinatization and
hence an analog of the halving strategy from Lemma 3.2 is available for Spoiler.
The most essential ingredient of the proof of Theorem 5.6.2 is a strategy for Spoiler
in the Ehrenfeucht game on graphs allowing him to simulate the Ehrenfeucht game
on the corresponding rotation systems.

5.1.4. Graphs with an excluded minor. No graph with treewidth h has $K_{h+2}$ as a
minor. The class of graphs embeddable into a closed 2-dimensional surface S is closed
under minors and, as follows from the Robertson-Seymour Graph Minor Theorem,
no graph from this class contains $K_h$ as a minor for some h = h(S). Extending
his earlier work on graphs embeddable into a fixed surface [36], Grohe [38] recently
announced a proof that, if a graph G does not contain $K_h$ as a minor, then $W_\#(G)$
is bounded by a constant c = c(h). The case of h = 5 is treated in detail in [37].

Alon, Seymour, and Thomas [3] proved that, if a graph G of order n does not
contain $K_h$ as a minor, then it has a separator of size at most $h^{3/2}\sqrt{n}$. Using
this result, for all connected graphs with this property one can prove [75] that
$D(G) = O(h^{3/2}\sqrt{n}) + O(\Delta(G) \log n)$.

5.1.5. Other classes of graphs. A graph is strongly regular if all its vertices have
equal degrees and, for some $\lambda$ and $\mu$, each pair of adjacent vertices has exactly $\lambda$
common neighbors and each pair of non-adjacent vertices has exactly $\mu$ common
neighbors. Non-isomorphic graphs with the same order, degree, and parameters
$\lambda$ and $\mu$ are standard examples of a failure of the 2-dim WL algorithm. Babai
studies the isomorphism problem for this class in [5]. His individualization-and-refinement
technique translates into a bound $W_\#(G) \le 2\sqrt{n}\log n$ for all strongly
regular graphs of a sufficiently large order n with the exception of the disjoint
unions of complete graphs and their complements (for which we have $D_\#(G) \le 3$).
Further improvements are obtained by Spielman [74].
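The definition of strong regularity can be checked directly on small examples. The sketch below assumes a graph is given as a dictionary from vertices to neighbor sets; the helper name `srg_parameters` is hypothetical and only illustrates the definition.

```python
from itertools import combinations


def srg_parameters(adj):
    """Return (n, k, lam, mu) if the graph is strongly regular, otherwise None.

    lam (resp. mu) is None when the graph has no adjacent (resp. non-adjacent) pair.
    """
    degrees = {len(nb) for nb in adj.values()}
    if len(degrees) != 1:
        return None
    k = degrees.pop()
    lam_values, mu_values = set(), set()
    for u, v in combinations(adj, 2):
        common = len(adj[u] & adj[v])
        # Adjacent pairs contribute to lambda, non-adjacent pairs to mu.
        (lam_values if v in adj[u] else mu_values).add(common)
    if len(lam_values) > 1 or len(mu_values) > 1:
        return None
    lam = lam_values.pop() if lam_values else None
    mu = mu_values.pop() if mu_values else None
    return (len(adj), k, lam, mu)
```

On the 5-cycle this yields the parameters (5, 2, 0, 1), and on the Petersen graph (built here as the graph of disjoint 2-subsets of a 5-element set) it yields (10, 3, 0, 1).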

Evdokimov, Ponomarenko, and Tinhofer [28] undertake an analysis of the 3-dimensional
WL algorithm on the classes of cographs, interval graphs, and even directed
path graphs (the latter class extends the class of interval graphs and contains
also all ptolemaic graphs, in particular, trees). It follows from [28] that $W_\#(G) \le 4$
for all G in any of these classes. The boundedness of $W_\#(G)$ for interval graphs
follows also from the paper of Laubner [51], who uses purely logical methods (while
Evdokimov et al. develop an algebraic approach that, in fact, originates from the
seminal work by Weisfeiler and Lehman).

Grohe [39] proves that $W_\#(G) = O(1)$ for all chordal line graphs. On the other
hand, he shows that there are chordal graphs with $W_\#(G) = \Omega(n)$ and the same
holds true for line graphs. The latter result is obtained by a reduction to the graphs
with $W_\#(G) = \Omega(n)$ constructed by Cai, Fürer, and Immerman [15] (cf. Theorem
5.7 below). Note that the Cai-Fürer-Immerman graphs are regular of degree 3,
where the regularity can be traded for bipartiteness after a slight modification.

5.2. General case. 

5.2.1. Identification problem. Recall that

$W_\#(G,H) \le W(G,H) \le D(G,H)$ and $W_\#(G,H) \le D_\#(G,H) \le D(G,H)$.

If we are motivated by the graph isomorphism problem, it is quite natural to focus
on these parameters under the assumption that G and H have the same order (not
to mention that $D_\#(G,H) = 1$ otherwise). Distinguishing a graph G from
all non-isomorphic H of the same order is sometimes called the identification problem.
In particular, we would like to determine or estimate the maximum of D(G,H)
(resp. $D_\#(G, H)$) as a function of n = v(G) = v(H). Equivalently, what is the
minimum r = r(n) such that Spoiler has a winning strategy in $\mathrm{EHR}_r(G, H)$ for all
non-isomorphic G and H of order n?
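For very small graphs the question can be explored by exhaustive search. The following is a brute-force sketch of the r-round Ehrenfeucht game (without counting and without a bound on the number of pebbles), assuming graphs are dictionaries from vertices to neighbor sets; the function name `distinguishing_depth` is illustrative, and the running time is exponential, so this is meant only for toy examples.

```python
from functools import lru_cache


def distinguishing_depth(adj_g, adj_h, max_rounds):
    """Smallest r <= max_rounds such that Spoiler wins the r-round game, else None."""
    vg, vh = list(adj_g), list(adj_h)

    def consistent(pairs, a, b):
        # Adding the pair (a, b) must keep the pebbled map a partial isomorphism.
        for x, y in pairs:
            if (x == a) != (y == b):
                return False
            if (x in adj_g[a]) != (y in adj_h[b]):
                return False
        return True

    @lru_cache(maxsize=None)
    def duplicator_wins(pairs, rounds):
        if rounds == 0:
            return True
        # Spoiler moves in G; Duplicator needs a surviving reply in H.
        for a in vg:
            if not any(consistent(pairs, a, b)
                       and duplicator_wins(pairs | {(a, b)}, rounds - 1)
                       for b in vh):
                return False
        # Spoiler moves in H; Duplicator needs a surviving reply in G.
        for b in vh:
            if not any(consistent(pairs, a, b)
                       and duplicator_wins(pairs | {(a, b)}, rounds - 1)
                       for a in vg):
                return False
        return True

    for r in range(1, max_rounds + 1):
        if not duplicator_wins(frozenset(), r):
            return r
    return None
```

For instance, the triangle and the path on three vertices are separated after two rounds, whereas the 6-cycle and the disjoint union of two triangles need three.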

By taking disjoint unions of complete and empty graphs, it is easy to find G and
H with $D(G, H) \ge (n + 1)/2$. Bounding $D_\#(G, H)$ from below is a much more subtle
issue. Using a nice nontrivial argument, Cai, Fürer, and Immerman [15] came up
with a linear lower bound.

Theorem 5.7 (Cai, Fürer, and Immerman [15]). For infinitely many n there are
non-isomorphic graphs G and H both of order n such that $W_\#(G,H) \ge cn$, where
c is a positive constant.

The calculation of Pikhurko et al. [60, Section 7.5] shows that one can take c =
0.00465.

Let us turn to upper bounds. Suppose that $G \not\cong H$ and v(G) = v(H) = n. Before
reading further, the reader might try to improve the trivial bound $D(G,H) \le n$
at least somewhat. It may be seen as a curious observation that $D(G, H) \le n - 1$
follows from the Harary version of the Ulam Reconstruction Conjecture, open for a
long time, claiming that non-isomorphic graphs of equal orders have different sets
of vertex-deleted subgraphs.

One solution of this exercise, giving $D(G, H) \le n - \frac14 \log n$, is to apply the
Erdős-Szekeres bound on Ramsey numbers. It implies that every graph G of large order
n contains a homogeneous set of more than $\frac12 \log n + \frac14 \log\log n$ vertices. Spoiler
pebbles the complement of such a set S in G. Suppose that the unpebbled set is
independent (otherwise we can play on the complementary graphs). If Duplicator
is lucky, she manages to pebble the complement of an independent set S' in H so
that $G \setminus S \cong H \setminus S'$. Identifying the pebbled parts, Spoiler compares the numbers
of vertices in S and in S' with the same neighborhood. These numbers cannot be
identical for G and H and, by v(G) = v(H), Spoiler can demonstrate this using at
most (|S| + 1)/2 further moves in one of the graphs.
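The move count in this warm-up can be checked by a short computation. The lines below are a worked sketch, assuming that Spoiler first pebbles the n - |S| vertices outside S, that logarithms are binary, and that the homogeneous set has more than (1/2) log n + (1/4) log log n vertices; these constants are taken as an assumption for the purpose of the arithmetic.

```latex
\begin{align*}
D(G,H) &\le (n - |S|) + \frac{|S| + 1}{2} \;=\; n - \frac{|S| - 1}{2}, \\
|S| > \tfrac12 \log n + \tfrac14 \log\log n
  &\;\Longrightarrow\;
  \frac{|S| - 1}{2} > \tfrac14 \log n + \tfrac18 \log\log n - \tfrac12, \\
\tfrac18 \log\log n - \tfrac12 \ge 0 \ \text{ once } \log\log n \ge 4
  &\;\Longrightarrow\;
  D(G,H) < n - \tfrac14 \log n \ \text{ for all sufficiently large } n.
\end{align*}
```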

After this warm-up, we can state an almost optimal bound. 

Theorem 5.8 (Pikhurko, Veith, and Verbitsky [63]). For every two non-isomorphic
graphs G and H of the same order n we have $D(G, H) \le (n + 3)/2$.

5.2.2. General bounds for the logical depth and width. In the case of the counting
logic, Theorem 5.7 provides us with infinitely many graphs G for which $D_\#(G) \ge
W_\#(G) \ge 0.00465\,n$. As usual, n denotes the order of a graph. An upper bound
easily follows from Theorem 5.8: we have $D_\#(G) \le 0.5\,n + 1.5$ for all G. Though
this bound does not use the power of counting quantifiers at all, we are not aware
of any better bound.

Consider the standard first-order logic (without counting). At first sight,
everything is clear here. Indeed, the general upper bound $W(G) \le D(G) \le n + 1$ is
attained, even for the width, by the complete graph $K_n$ and by the empty graph $\overline{K_n}$.
However, these are the only two extremal graphs. In other words, $D(G) \le n$ for all
G with the exception of $G \in \{K_n, \overline{K_n}\}$. As $K_n$ and $\overline{K_n}$ are the most symmetric graphs,
this observation suggests two problems. The first one is to prove a better bound
for a class of graphs with restrictions on the automorphism group. The second is to
obtain, for as small as possible l = l(n), an explicit or algorithmic description of all
order-n graphs whose logical depth (resp. width) exceeds l. We start with the first
problem.

Definition 5.9. Let u, v, and s be three vertices with $s \notin \{u, v\}$. We say that s
separates u and v if s is adjacent to exactly one of the two vertices. Furthermore,
we call u and v twins if no s separates u and v (or, equivalently, if the transposition
of u and v is an automorphism of the graph). A graph is called twin-free if it has
no twins.
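The notions of separation and twins translate directly into code. Here is a minimal sketch, assuming a graph is a dictionary from vertices to neighbor sets; the function names are illustrative.

```python
from itertools import combinations


def separates(adj, s, u, v):
    """True if s is distinct from u and v and adjacent to exactly one of them."""
    return s not in (u, v) and ((s in adj[u]) != (s in adj[v]))


def twins(adj):
    """All unordered pairs of distinct vertices that no third vertex separates."""
    return [(u, v) for u, v in combinations(adj, 2)
            if not any(separates(adj, s, u, v) for s in adj)]


def is_twin_free(adj):
    return not twins(adj)
```

The path on four vertices is twin-free, while in a star any two leaves are twins.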

Theorem 5.10 (Pikhurko, Veith, and Verbitsky [63]). If G is twin-free, then
$D_1(G) \le (n + 5)/2$.

Theorem 5.10 cannot be improved to a sublinear bound. Indeed, consider $mP_4$, the
disjoint union of m copies of $P_4$. As is easily seen, $mP_4$ is twin-free and $D(mP_4) \ge
D(mP_4, (m + 1)P_4) > m$ (the reader is welcome to play $\mathrm{EHR}_m(mP_4, (m + 1)P_4)$ on
Duplicator's side). No sublinear improvement is possible even for connected graphs:
the graphs constructed by Cai, Fürer, and Immerman in Theorem 5.7 are twin-free.

We prove Theorem 5.10 based on Lemma 2.5.1 and the Ehrenfeucht Theorem
(Theorem 3.3.1). That is, we design a strategy allowing Spoiler to win $\mathrm{EHR}_r(G, H)$
for any $H \not\cong G$, where $r = \lfloor (n + 5)/2 \rfloor$. As an important additional feature of the
strategy, Spoiler will alternate between the graphs only once. By Theorem 3.3.2, this
shows that our bound holds even for the logic with only one quantifier alternation
(as indicated by the subscript in Theorem 5.10).

Definition 5.11. Let $X \subseteq V(G)$. Given two vertices $u, v \in V(G) \setminus X$, we call them
X-similar and write $u \equiv_X v$ if u and v are inseparable by any vertex in X, i.e., if

$N(u) \cap X = N(v) \cap X.$

Now, let $y \notin X$. We say that X sifts out y if for every $y' \notin X$ the relation $y \equiv_X y'$
implies y' = y (in other words, the vertex y is uniquely identified by its adjacencies
to X). Let S(X) consist of all $x \in X$ and all y sifted out by X. We call X a sieve¹
if S(X) = V(G). Furthermore, X is called a weak sieve if S(S(X)) = V(G).
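The operator S and the two sieve notions can be computed directly from the definition. The following sketch assumes a graph is a dictionary from vertices to neighbor sets; the names `sifted_closure`, `is_sieve`, and `is_weak_sieve` are illustrative.

```python
def sifted_closure(adj, X):
    """S(X): X together with every vertex outside X whose neighborhood
    inside X is not shared by any other vertex outside X."""
    X = set(X)
    profiles = {}
    for y in adj:
        if y not in X:
            profiles.setdefault(frozenset(adj[y] & X), []).append(y)
    sifted = {ys[0] for ys in profiles.values() if len(ys) == 1}
    return X | sifted


def is_sieve(adj, X):
    return sifted_closure(adj, X) == set(adj)


def is_weak_sieve(adj, X):
    return sifted_closure(adj, sifted_closure(adj, X)) == set(adj)
```

On the path 0-1-2-3, the set {1} sifts out only the vertex 3, so it is not a sieve, but S({1}) = {1, 3} sifts out the remaining two vertices, so {1} is a weak sieve.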

Consider the Ehrenfeucht game on non-isomorphic G and H and assume that X
is a sieve in G. Let Spoiler pebble all vertices of X. We leave it to the reader to verify
that Spoiler can win in at most 2 more moves. We now describe a more advanced
Weak Sieve Strategy.

Lemma 5.12. If X is a weak sieve in G, then Spoiler is able, for any $H \not\cong G$, to
win $\mathrm{EHR}_r(G, H)$ with $r \le |X| + 3$. Moreover, he does not need to jump from one
graph to the other more than once during the game.

Proof. First, Spoiler selects all of X. Let $X' \subseteq V(H)$ be Duplicator's reply.
Assume that Duplicator has not lost yet. For notational simplicity let us identify
X and X' so that $V(G) \cap V(H) = X = X'$ and the players' moves coincide on X.
Let Y = S(X) in G and Y' = S(X') in H.

It is not hard to see that Spoiler wins in at most two extra moves unless the
following holds. For any $y \in Y \setminus X$ there is a $y' \in Y' \setminus X$ (and vice versa) such that
$N(y) \cap X = N(y') \cap X$. Moreover, this bijective correspondence between Y and Y'
establishes an isomorphism between G[Y] and H[Y'].

Suppose that this is the case and identify Y with Y'. Let $Z = V(G) \setminus Y$ and
$Z' = V(H) \setminus Y$. Let $z \in Z$ and define

$W'_z = \{\, z' \in Z' : N(z') \cap Y = N(z) \cap Y \,\}.$

If $W'_z = \emptyset$, Spoiler wins in at most two moves. First, he selects z. Let Duplicator
reply with z'. Assume that $z' \in Z'$ for otherwise she has already lost. As the
neighborhoods of z and z' in Y differ, Spoiler can demonstrate this by picking a vertex
of Y. If $|W'_z| \ge 2$, then Spoiler selects any two vertices in $W'_z$ and wins with at most
one more move, as required.

Hence, we can assume that for any z we have $W'_z = \{f(z)\}$ for some $f(z) \in Z'$.
Since each vertex in Z is sifted out by Y, the function f is injective. If $f(Z) \ne Z'$,
Spoiler easily wins in two moves. Suppose, therefore, that $f : Z \to Z'$ is a bijection.
As $G \not\cong H$, the mapping f does not preserve the adjacency relation between some
$y, z \in Z$. Now, Spoiler selects both y and z. Duplicator cannot respond with f(y)
and f(z); by the definition of f Spoiler can win in one extra move. □

¹ Babai [5] uses sieves under the name distinguishing sets.

Theorem 5.10 immediately follows from Lemma 5.12 and the next lemma.

Lemma 5.13. Any twin-free graph G on n vertices has a weak sieve X with
$|X| \le (n-1)/2$.

Proof. Given $X \subseteq V(G)$, let C(X) denote the partition of $\overline{X} = V(G) \setminus X$ into
$\equiv_X$-equivalence classes. Starting from $X = \emptyset$, we repeat the following procedure.
As long as there exists $u \in \overline{X}$ such that $|C(X \cup \{u\})| > |C(X)|$, we move u to
X. As soon as there is no such u, we arrive at X which is C-maximal, that is,
$|C(X \cup \{u\})| \le |C(X)|$ for any $u \in \overline{X}$. Note that $|C(X)| \ge |X| + 1$ because this
inequality is true at the beginning and is preserved in each construction step. Using
also the inequality $|X| + |C(X)| \le n$, we conclude that $|X| \le (n - 1)/2$.

We now prove that X is a weak sieve. Suppose, to the contrary, that u and v
are distinct S(X)-similar vertices in $Z = V(G) \setminus S(X)$. Since G has no twins, these
vertices are separated by some s. We cannot have $s \in S(X)$ by the definition of
S(X)-similarity. Thus $s \in Z$. Let $C_1$ be the class in C(X) including {u, v} and $C_2$
be the class in C(X) containing s. Since $s \notin S(X) \setminus X$, the class $C_2$ has at least one
more element in addition to s. If $C_1 \ne C_2$, moving s to X splits up $C_1$ and does not
eliminate $C_2$. If $C_1 = C_2$, moving s to X splits up this class and splits up or does
not affect the others. In either case |C(X)| increases, giving a contradiction. □
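The greedy procedure from this proof can be sketched as follows. The code assumes a graph is a dictionary from vertices to neighbor sets; the names `classes`, `c_maximal_set`, and `sifted_closure` are illustrative, and the order in which candidate vertices are tried is an arbitrary choice of this sketch.

```python
def classes(adj, X):
    """C(X): partition of V(G) \\ X into classes of vertices with equal neighborhoods in X."""
    groups = {}
    for y in adj:
        if y not in X:
            groups.setdefault(frozenset(adj[y] & X), set()).add(y)
    return list(groups.values())


def c_maximal_set(adj):
    """Greedily move vertices into X while this increases the number of classes."""
    X = set()
    improved = True
    while improved:
        improved = False
        current = len(classes(adj, X))
        for u in adj:
            if u not in X and len(classes(adj, X | {u})) > current:
                X.add(u)
                improved = True
                break
    return X


def sifted_closure(adj, X):
    """S(X): X plus the vertices forming singleton classes of C(X)."""
    return set(X) | {next(iter(c)) for c in classes(adj, X) if len(c) == 1}
```

On a twin-free input, the returned set X should satisfy both conclusions of the lemma: 2|X| + 1 <= n and S(S(X)) = V(G).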

The proof of Theorem 5.10 is complete. This theorem was significantly extended in
[63], giving some progress on the second research problem stated above. In particular,
it was shown that one can efficiently check whether or not $D(G) \le (n + 5)/2$ for the
input graph G of order n and, if this is not true, then one can efficiently compute
the exact value of D(G). Also, the same holds for W(G).

This result is interesting in view of the fact that algorithmic computability of the
logical depth and width of a graph, even with no efficiency requirements, is unclear.
A reason for this is that the question of whether a given first-order sentence defines
some graph is known to be undecidable [61].

The upper bound of $\frac12 n + O(1)$ can be improved if we impose a restriction on the
maximum vertex degree.

Theorem 5.14 (Pikhurko, Veith, and Verbitsky [63]). Let $d \ge 2$. Let G be a graph
of order n with no isolated vertex and no isolated edge. If $\Delta(G) \le d$, then

$D_1(G) < c_d\, n + d^2 + d + 4$

for a constant $c_d = \frac12 - \frac14 d^{-2d-5}$.

Theorem 5.14 aims at showing a constant $c_d$ strictly less than 1/2 rather than
at attempting to find the optimum $c_d$. In the case of d = 2, which is simple and
included just for uniformity, an optimal bound is $D_1(G) \le n/3 + O(1)$. Without the
assumption that G has no isolated vertex and edge, the theorem does not hold for
any fixed $c_d < 1/2$. A counterexample is provided by the disjoint union of isolated
edges. Even under the stronger assumption that G is connected, Theorem 5.14
still does not admit any sublinear improvement: the Cai-Fürer-Immerman graphs
in Theorem 5.7 are connected and have maximum degree 3.

6. Average case bounds 

In Section 5 we investigated the maximum values of logical parameters over graphs
of order n. Now we want to know their typical values. A natural setting for this
problem is given by the Erdős-Rényi model of a random graph $G_{n,p}$. The latter is
a random graph on n vertices where every two vertices are connected by an edge
with probability p independently of the other pairs. A particularly important case
is $G_{n,1/2}$, when we have the uniform distribution on all graphs on a fixed set of n
vertices. Whenever we say that for a random graph of order n something happens
with high probability (abbreviated as whp), we mean a probability approaching 1 as
$n \to \infty$.

6.1. Bounds for almost all graphs. 

6.1.1. Logic with counting. We begin with a simple but useful observation about the
color refinement algorithm described at the beginning of Section 4. If the coloring
of a graph stabilizes with all color classes becoming singletons, it can be considered
a canonical vertex ordering. It turns out that this happens for almost all graphs.
This result can be used to estimate the logical complexity of almost all graphs, in
particular, to show that almost surely $W_\#(G_{n,1/2}) = 2$ (Immerman and Lander [47]).

Theorem 6.1. 

1. (Babai, Erdős, and Selkow [7]) 2 color refinements split a random graph $G_{n,1/2}$
into color classes which are singletons with probability more than $1 - 1/\sqrt[7]{n}$,
for all large enough n. Consequently, $D^2_\#(G_{n,1/2}) \le 4$ with this probability.

2. (Babai and Kučera [8]) 3 color refinements split a random graph $G_{n,1/2}$ into
color classes which are singletons with probability more than $1 - 1/2^{cn}$, for
a constant c > 0 and all large enough n. Consequently, $D^2_\#(G_{n,1/2}) \le 5$ with
this probability.

The logical conclusions made in Theorem 6.1 are based on the necessity part of
Theorem 4.4. It suffices to notice that, once the color refinement splits the vertex
set of an input graph G into singletons, one extra round of the algorithm suffices to
distinguish G from any non-isomorphic graph.
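Color refinement itself takes only a few lines. The sketch below assumes one common convention, which may differ in details from the description in Section 4: all vertices start with the same color, and in each round the new color of a vertex is its old color together with the multiset of old colors of its neighbors, so that after one round the color encodes the degree. Graphs are dictionaries from vertices to neighbor sets, and the name `color_refinement` is illustrative.

```python
def color_refinement(adj, rounds):
    """Colors after the given number of refinement rounds, as nested tuples.

    The colors are canonical, so they can be compared across different graphs.
    """
    color = {v: () for v in adj}
    for _ in range(rounds):
        color = {v: (color[v], tuple(sorted(color[w] for w in adj[v])))
                 for v in adj}
    return color
```

As a usage example, take the 7-vertex tree obtained from the path 0-1-2-3-4-5 by attaching a leaf 6 to vertex 2: under the convention above, two rounds leave the vertices 0 and 5 with equal colors, and three rounds split the vertex set into singletons. In contrast, the 6-cycle and the disjoint union of two triangles receive the same multiset of colors after any number of rounds.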

Next, we are going to show that the upper bound of Theorem 6.1.1 is best possible.
Let $C^r_G$ denote the coloring of the vertex set of a graph G produced by the color
refinement procedure in r rounds.

Lemma 6.2. Whp for $G = G_{n,1/2}$ there exists a non-isomorphic graph H on V =
V(G) such that $C^2_G(x) = C^2_H(x)$ for every vertex $x \in V$.

Proof. Let G be a typical graph of a sufficiently large order n. In particular, we
assume that G satisfies Theorem 6.1.1 and that $|\deg x - n/2| < m$ for every vertex
x of G, where we can take, e.g., $m = \sqrt{(n \log n)/2}$ by a simple application of
Chernoff's bound (see also [13, Corollary 3.4]). By the Pigeonhole Principle, there
is a set U of $u = \lceil n/(2m + 1) \rceil$ vertices all having the same degree.

Another property of the random graph G that we assume is that every set $X \subseteq V$
of size u contains distinct vertices w, x, y, z with $wx, yz \in E(G)$ and $xy, zw \notin E(G)$.
Indeed, let us fix a u-set $X \subseteq V$ and estimate the probability that it violates this
property. One can find at least $\binom{u}{2}/5$ edge-disjoint 4-cycles inside the complete
graph on X. (For example, picking up cycles one by one arbitrarily, we get enough
of them by the well-known fact that a $C_4$-free graph on u vertices has $O(u^{3/2})$ edges,
see, e.g., [2, Chapter 25.5].) For each 4-cycle on vertices $x_1, x_2, x_3, x_4$ in this order, at
least one of the relations $x_1x_2, x_3x_4 \in E(G)$ and $x_2x_3, x_4x_1 \notin E(G)$ should be false,
this having probability 15/16. By the edge-disjointness, these events for different
selected cycles are mutually independent. Hence X violates the desired property
with probability at most $(15/16)^{\binom{u}{2}/5} = o\bigl(\binom{n}{u}^{-1}\bigr)$. Since there are $\binom{n}{u}$ candidates
for a bad set X, the probability that it exists is o(1), as required.

Hence, the equidegree set U contains vertices w, x, y, z with $wx, yz \in E(G)$ and
$xy, zw \notin E(G)$. Let H be obtained from G by removing the edges wx, yz and adding
the edges xy, zw. This operation preserves the degree of every vertex as well as the
multiset of degrees of its neighbors, that is, $C^2_G(v) = C^2_H(v)$ for every vertex $v \in V$.
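The edge switch used in this step is simple to express in code. The sketch below assumes a graph is a dictionary from vertices to neighbor sets; the names `two_switch` and `degree_profile` are illustrative. It only demonstrates the invariance of degrees and neighbor-degree multisets when the four vertices have equal degrees, not the probabilistic part of the argument.

```python
def two_switch(adj, w, x, y, z):
    """Remove the edges wx, yz and add the edges xy, zw; returns a new graph."""
    assert x in adj[w] and z in adj[y], "wx and yz must be edges"
    assert y not in adj[x] and w not in adj[z], "xy and zw must be non-edges"
    new = {v: set(nb) for v, nb in adj.items()}
    for a, b in ((w, x), (y, z)):
        new[a].discard(b)
        new[b].discard(a)
    for a, b in ((x, y), (z, w)):
        new[a].add(b)
        new[b].add(a)
    return new


def degree_profile(adj):
    """For each vertex: its degree and the sorted list of degrees of its neighbors."""
    return {v: (len(adj[v]), sorted(len(adj[t]) for t in adj[v])) for v in adj}
```

Applied to the 6-cycle with (w, x, y, z) = (0, 1, 3, 4), the switch produces two disjoint triangles with the same degree profile at every vertex.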

Suppose that G and H are isomorphic. Any isomorphism f must preserve the
$C^2$-colors. Since the $C^2$-classes are all singletons, f has to be the identity map on
V(G) = V(H). But then the adjacency between, e.g., w and x is not preserved, a
contradiction. The lemma is proved. □

Given a typical $G = G_{n,1/2}$, let H be a graph satisfying Lemma 6.2. Thus, the 2-round
color refinement fails to distinguish between G and H. By the sufficiency part
of Theorem 4.4, we have $D^2_\#(G) > 3$. As an alternative proof, the reader can design
a winning strategy for Duplicator in the 3-round counting game on G and H with
2 pebbles. Together with Theorem 6.1.1, this bound gives us the exact value of
$D^2_\#(G_{n,1/2})$.

Theorem 6.3. Whp $D^2_\#(G_{n,1/2}) = 4$.

We always have $D_\#(G) \le D^2_\#(G)$ and, on the other hand, $D_\#(G) \le 2$ implies
$D^2_\#(G) \le 2$ because any definition with quantifier depth 2 can be rewritten
using only 2 variables. It follows from Theorem 6.3 that $3 \le D_\#(G_{n,1/2}) \le 4$ whp.
Unfortunately, we could not decide whether the typical value of $D_\#(G_{n,1/2})$ is 3 or
4, which seems to be an interesting question.

6.1.2. Logic without counting.

Theorem 6.4 (Kim et al. [50]). Fix an arbitrarily slowly increasing function
$\omega = \omega(n)$. Then we have whp that

$\log n - 2\log\log n + \log\log e + 1 - o(1) \le W(G_{n,1/2}) \le D_1(G_{n,1/2}) \le \log n - \log\log n + \omega.$

We first prove the lower bound. 

Definition 6.5. For an integer $k \ge 1$, we say that a graph G has the k-extension
property if, for every two disjoint $X, Y \subseteq V(G)$ with $|X \cup Y| \le k$, there is a vertex
$z \notin X \cup Y$ adjacent to all $x \in X$ and non-adjacent to all $y \in Y$.
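For small graphs the k-extension property can be verified exhaustively. The following sketch assumes a graph is a dictionary from vertices to neighbor sets; the name `has_extension_property` is illustrative, and the running time grows exponentially with k.

```python
from itertools import combinations


def has_extension_property(adj, k):
    """Check the k-extension property by enumerating all disjoint pairs (X, Y)
    with |X| + |Y| <= k."""
    vertices = list(adj)
    for size in range(k + 1):
        for chosen in combinations(vertices, size):
            chosen_set = set(chosen)
            # Every way to split `chosen` into X (required neighbors)
            # and Y (required non-neighbors).
            for mask in range(1 << size):
                X = {chosen[i] for i in range(size) if (mask >> i) & 1}
                Y = chosen_set - X
                if not any(z not in chosen_set
                           and X <= adj[z]
                           and not (Y & adj[z])
                           for z in vertices):
                    return False
    return True
```

The 5-cycle has the 1-extension property but not the 2-extension property, because two adjacent vertices of the cycle have no common neighbor.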

Lemma 6.6. If both G and H have the k-extension property, then $W(G, H) \ge k + 2$.

Proof. By Theorem 3.3.3 it suffices to design a strategy allowing Duplicator to survive
in $\mathrm{EHR}^{k+1}(G, H)$ arbitrarily long. Suppose that Spoiler puts pebble p on a
new position v in one of the graphs, say, G. Let X (resp. Y) denote the set of
pebbled vertices in H whose counterparts in G are adjacent (resp. non-adjacent) to
v. Duplicator moves the other copy of p to a vertex z with the given adjacencies to
$X \cup Y$, whose existence is guaranteed by the k-extension property. □

Lemma 6.7. Let $\epsilon > 0$ be a real constant. Then the k-extension property holds for
$G_{n,1/2}$ whp for any $k \le \log n - 2\log\log n + \log\log e - \epsilon$.

Proof. Let n be large. Any particular X and Y with $|X \cup Y| = k$ falsify the k-extension
property with probability $(1 - 2^{-k})^{n-k}$. Since the number of such pairs is
$\binom{n}{k} 2^k$, a random graph $G_{n,1/2}$ does not have the k-extension property with
probability at most

$\binom{n}{k} 2^k (1 - 2^{-k})^{n-k} \le n^k (1 - 2^{-k})^n \le \exp\{k \ln n - n 2^{-k}\}.$

The former inequality is true only if $k \ge 4$ but this causes no problem because the
k-extension property implies itself for all smaller values of the parameter k. Since the
function $f(x) = x \ln n - n 2^{-x}$ is monotone, the k-extension property fails with
probability bounded from above by

$\exp\{f(\log n - 2\log\log n + \log\log e - \epsilon)\} = \exp\{(\ln 2)(-2^{\epsilon} + 1 + o(1)) \log^2 n\} = o(1),$

as claimed. □
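The union bound in this proof can be evaluated numerically for concrete values of n and k. The sketch below works with natural logarithms of the probabilities to avoid underflow; the function names `log_failure_bound` and `threshold` are illustrative, and the binary logarithm is assumed for log, as in the statement of the lemma.

```python
import math


def log_failure_bound(n, k):
    """Natural logarithm of binom(n, k) * 2^k * (1 - 2^(-k))^(n - k)."""
    log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1))
    return log_binom + k * math.log(2) + (n - k) * math.log1p(-2.0 ** (-k))


def threshold(n, eps):
    """log n - 2 log log n + log log e - eps, with logarithms to base 2."""
    return (math.log2(n) - 2 * math.log2(math.log2(n))
            + math.log2(math.log2(math.e)) - eps)
```

For example, with n = 2^20 and eps = 0.5 the threshold lies between 11 and 12, and for k = 11 the logarithm of the bound is far below zero, in line with the estimate exp{k ln n - n 2^{-k}}.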

Fix $\epsilon > 0$. Let n be sufficiently large and set
$k = \lfloor \log n - 2\log\log n + \log\log e - \epsilon \rfloor$.
By Lemma 6.7, $G = G_{n,1/2}$ has the k-extension property whp. Let H be a graph
which also possesses the k-extension property and is non-isomorphic to G. The
existence of such a graph follows also from Lemma 6.7. Given G, let $H = G_{n,1/2}$
be another, independent copy of a random graph. It should only be noticed that
$H \cong G$ with probability at most $n!\,2^{-\binom{n}{2}} = o(1)$. By Lemma 6.6 we have

$W(G) \ge W(G,H) \ge k + 2 > \log n - 2\log\log n + \log\log e + 1 - \epsilon,$

thereby proving the lower bound of Theorem 6.4.

To prove the upper bound, we employ the Weak Sieve Strategy that was designed
in Section 5.2.2. Lemma 5.12 allows us to estimate the parameter $D_1(G)$ by the
size of a weak sieve existing in G. The upper bound of Theorem 6.4 follows from
Lemma 6.8 below, which gives us a good enough bound for the size of a weak sieve in
a random graph. The paper [50] states a slightly weaker upper bound than that in
Theorem 6.4 (namely, $\omega = C \log\log\log n$ there). The current more precise estimate
is due to Joel Spencer (unpublished).

Lemma 6.8. Fix an arbitrarily slowly increasing function $\omega = \omega(n)$. Then whp
$G_{n,1/2}$ has a weak sieve of size at most $\log n - \log\ln n + \omega$.

Proof. We will consider a random graph $G_{n,1/2}$ on an n-vertex set V. Fix $X \subseteq V$
with $|X| = \log n - s$, where $s = \log\ln n - \omega$. We generate $G_{n,1/2}$ in two stages.

Stage 1: reveal the edges between X and $V \setminus X$ (needless to say, each such edge
appears with probability 1/2 independently of the others). Our goal at this stage is
to show that S(X) is large whp.

A fixed $y \in V \setminus X$ is sifted out by X with probability

$(1 - 2^{-|X|})^{n-|X|-1} = \exp\{-2^{s}(1 + o(1))\} = n^{-2^{-\omega}(1+o(1))} = n^{-o(1)}.$

By linearity of expectation,

$E[\,|S(X) \setminus X|\,] = (n - |X|)\, n^{-o(1)} = n^{1-o(1)}.$

We can now apply martingale techniques to show that whp $|S(X) \setminus X|$ is
concentrated near its mean value. More precisely, we need the following estimate:

$P\bigl[\, |S(X) \setminus X| < E[\,|S(X) \setminus X|\,] - 2\lambda\sqrt{n - |X|} \,\bigr] < e^{-\lambda^2/2}$   (15)

for any $\lambda > 0$, where P[A] denotes the probability of an event A.

To prove it, consider the probability space consisting of all functions $g : V \setminus X \to
2^X$. Define a random variable L on this space by setting L(g) to be equal to
the number of values in $2^X$ taken on by g exactly once. Note that, if g and
g' differ only at one point, then $|L(g) - L(g')| \le 2$. Construct an appropriate
martingale as explained in the Alon-Spencer book [4, Chapter 7.4]. Namely,
let $V \setminus X = \{y_1, \ldots, y_m\}$ and define a sequence of auxiliary random variables
$X_0, X_1, \ldots, X_m$ by $X_i(h) = E[\, L(g) \mid g(y_j) = h(y_j) \text{ for all } j \le i \,]$. By Azuma's
inequality (see [4, Theorems 7.2.1 and 7.4.2]), for all $\lambda > 0$ we have

$P\bigl[\, L(g) < E[L(g)] - 2\lambda\sqrt{m} \,\bigr] < e^{-\lambda^2/2},$

which is exactly what is claimed by (15).

By (15) we have whp that

$|S(X) \setminus X| \ge n^{1-o(1)}.$

Conditioning on S(X) satisfying this bound, we go to the next stage of generating
$G_{n,1/2}$.

Stage 2: reveal the edges inside $V \setminus X$. It is enough to show that $V \setminus S(X) \subseteq
S(S(X) \setminus X)$ whp. If the last claim is false, then there are $z, z' \in V \setminus S(X)$ having
the same adjacencies to $S(X) \setminus X$. This happens with probability no more than

$\binom{n}{2} 2^{-|S(X) \setminus X|} \le n^2\, 2^{-n^{1-o(1)}} = o(1).$

The proof is complete. □
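The first stage of this argument can be explored by simulation. The sketch below samples a uniformly random function g from an m-element set into the subsets of an x-element set (encoded as x-bit integers) and counts the values taken exactly once, which is the random variable L(g) from the proof. The names `sample_L` and `expected_L` are illustrative, and the expectation formula m(1 - 2^{-x})^{m-1} is the one obtained by linearity of expectation.

```python
import random
from collections import Counter


def sample_L(m, x_size, rng):
    """L(g) for a uniformly random g from an m-set into subsets of an x_size-set."""
    values = Counter(rng.getrandbits(x_size) for _ in range(m))
    return sum(1 for count in values.values() if count == 1)


def expected_L(m, x_size):
    """Exact expectation of L(g): each point is alone on its value
    with probability (1 - 2^(-x_size))^(m - 1)."""
    return m * (1 - 2.0 ** (-x_size)) ** (m - 1)
```

Averaging `sample_L(1000, 10, rng)` over a few hundred runs with a seeded `random.Random` instance gives a value close to `expected_L(1000, 10)`, illustrating the concentration that Azuma's inequality makes precise.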

Theorem 6.4 shows rather close lower and upper bounds for the logical width and
depth of a random graph $G_{n,1/2}$. Surprisingly, even this can be improved.

Theorem 6.9 (Kim et al. [50]). For infinitely many n we have whp

$D_2(G_{n,1/2}) \le \log n - 2\log\log n + \log\log e + 6 + o(1).$

This upper bound exceeds the lower bound of Theorem 6.4 by at most 5 + o(1).
It follows that, for infinitely many n, the parameters $D_i(G)$ with $i \ge 2$, D(G),
and W(G) are all concentrated on at most 6 possible values (while some extra work,
see [50, Section 4.3], gives a 5-point concentration).

6.1.3. Bounds for trees. 

Theorem 6.10 (Bohman et al. [12]). Let $T_n$ denote a tree on the vertex set
{1, 2, ..., n} selected uniformly at random among all $n^{n-2}$ such trees. Whp we have
$W(T_n) = (1 + o(1))\frac{\log n}{\log\log n}$ and $D(T_n) = (1 + o(1))\frac{\log n}{\log\log n}$.

The lower bound for $W(T_n)$ immediately follows from the following property of
a random tree: whp $T_n$ has a vertex adjacent to $(1 + o(1))\frac{\log n}{\log\log n}$ leaves. Note that
the upper bound for $D(T_n)$ does not follow directly from Theorem 5.4 because whp
$\Delta(T_n) = (1 + o(1))\frac{\log n}{\log\log n}$ (see Moon [57]).

6.2. An application: The convergency rate in the zero-one law. We will
write $G \models \Phi$ to say that a sentence $\Phi$ is true on a graph G. Let $p_n(\Phi) =
P[G_{n,1/2} \models \Phi]$. The 0-1 law established by Glebskii et al. [32] and, independently,
by Fagin [29] says that, for each $\Phi$, $p_n(\Phi)$ approaches 0 or 1 as $n \to \infty$. Denote the
limit by $p(\Phi)$.

Define the convergency rate function for the 0-1 law by

$R(k, n) = \max\{\, |p_n(\Phi) - p(\Phi)| : D(\Phi) \le k \,\}.$

Note that the maximization here can be restricted to a finite set by Theorem 2.4.1.
Therefore, the standard version of the 0-1 law implies that $R(k, n) \to 0$ as $n \to \infty$ for
any fixed k. Naor, Nussboim, and Tromer [58] showed that
$R(\log n - 2\log\log n, n) \to 0$. Another result in [58] states that one can choose
p(n) = 1/2 + o(1) and k(n) = (2 + o(1)) log n such that the probability that $G_{n,p}$
has a k(n)-clique is bounded away from 0 and 1. Thus for this probability p(n) the
0-1 law does not hold with respect to formulas of depth k(n).

The following theorem sharpens slightly the first part of the above result and 
improves on the second part in two aspects: we do not need to change the probability 
p = 1/2 and we get an almost best possible upper bound.

Theorem 6.11. Let $g(n) = \log n - 2\log\log n + \log\log e + c$ with constant c.

1. If c < 1, then $R(g(n), n) \to 0$ as $n \to \infty$.

2. If c > 6, then $R(g(n), n)$ does not tend to 0 as $n \to \infty$. More strongly, for
every $\gamma \in [0, 1]$ there is a sequence of formulas $\Phi_{n_1}, \Phi_{n_2}, \ldots$ (where $n_i < n_{i+1}$)
with $D(\Phi_{n_i}) \le g(n_i)$ such that $p_{n_i}(\Phi_{n_i}) \to \gamma$ as $i \to \infty$.

Part 2 follows from Theorem 6.9. The latter implies that (for infinitely many
n) actually any property $\mathcal{P}$ of graphs on n vertices can be "approximated" by a
first-order sentence of depth at most g(n). Indeed, take the disjunction of defining
formulas over all graphs in $\mathcal{P}$ of order n and depth at most g(n). The omitted graphs
constitute a negligible proportion of all graphs by Theorem 6.9.

We now prove Part 1. Like the proof in [58] we use the extension property, but 
we argue in a slightly different way. 

Proof of Part 1. Let $E_k$ denote a first-order statement of quantifier depth k expressing
the $(k-1)$-extension property. Lemma 6.7 provides us with an infinitesimal
$\alpha(n)$ such that

$1 - p_n(E_k) \le \alpha(n)$ as long as $k \le g(n)$. (16)

We will consider g(x) on the range $x \ge 2$. This function is decreasing for $x \le e^2$
and increasing for $x \ge e^2$. Since g(2) < 3, for any $k \ge 3$ there is some $n_k$ such that
the conditions $g(n) \ge k$ and $n \ge n_k$ are equivalent. We fix a value $k_0 \ge 3$ so that
$1 - \alpha(n) > \sqrt{3}/2$ whenever $g(n) \ge k_0$. Note that

$p_n(E_k) > \sqrt{3}/2$ whenever $g(n) \ge k \ge k_0$. (17)
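
The shape of g used here is easy to check numerically. The following is a minimal sketch, assuming base-2 logarithms (as in the depth bounds of this section) and an illustrative value c = 0.5; the function name `g` and the sample points are our own choices, not part of the original argument.

```python
import math

def g(x, c=0.5):
    """g(x) = log x - 2 log log x + log log e + c, with base-2 logarithms (assumed)."""
    return (math.log2(x) - 2 * math.log2(math.log2(x))
            + math.log2(math.log2(math.e)) + c)

x_min = math.e ** 2  # turning point of g claimed in the proof

# Sample g on both sides of e^2.
left = [g(x) for x in (2, 3, 5, x_min)]
right = [g(x) for x in (x_min, 10, 100, 10**6)]

print(all(a > b for a, b in zip(left, left[1:])))    # decreasing up to e^2
print(all(a < b for a, b in zip(right, right[1:])))  # increasing after e^2
print(g(2, c=0.999) < 3)                             # g(2) < 3 whenever c < 1
```

The check only samples a few points, so it illustrates the claim rather than proving it; the turning point itself follows from differentiating g.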

The result readily follows from the following fact. 

Claim A. If $k_0 \le k \le g(n)$, then for every first-order statement $\Phi$ with $D(\Phi) = k$
we have

$|p_n(\Phi) - p(\Phi)| \le 2\alpha(n)$. (18)

We will prove first a more modest bound.

Claim B. If $k_0 \le k \le g(n)$, then for every first-order statement $\Phi$ with $D(\Phi) = k$
we have

$|p_n(\Phi) - p(\Phi)| < 1/2$. (19)

Proof of Claim B. Consider a pair of integers k and n such that $k_0 \le k \le g(n)$. Let
$\Phi$ be a first-order statement with $D(\Phi) = k$. Without loss of generality, suppose
that $p(\Phi) = 0$. If $p_N(\Phi) \ge 1/2$ for some N, let N denote the largest such number.
For M = N + 1 we have $p_M(\Phi) < 1/2$. Let $G_N$ and $G_M$ be independent random
graphs with, respectively, N and M vertices. Note that

$\mathbf{P}[D(G_N, G_M) \le k] \ge \mathbf{P}[G_N \models \Phi \ \&\ G_M \not\models \Phi] > 1/4$.

On the other hand, Lemma 6.6 implies that

$\mathbf{P}[D(G_N, G_M) > k] \ge \mathbf{P}[G_N \models E_k \ \&\ G_M \models E_k] = p_N(E_k)\, p_M(E_k)$.

It follows that $p_N(E_k)\, p_M(E_k) < 3/4$ and, therefore, $p_N(E_k) < \sqrt{3}/2$ or $p_M(E_k) <
\sqrt{3}/2$. Comparing this with (17), we conclude that g(N) < k or g(M) < k, which
implies that n > N. The desired bound (19) follows now by the definition of N. $\triangleleft$
Proof of Claim A. Consider a pair of integers k and n such that $k_0 \le k \le g(n)$.
Let $\Phi$ be a first-order statement with $D(\Phi) = k$. Let G' and G'' be two independent
copies of $G_{n,1/2}$. By Lemma 6.6,

$\mathbf{P}[D(G', G'') > k] \ge \mathbf{P}[G' \models E_k \ \&\ G'' \models E_k] = p_n(E_k)^2$.

On the other hand,

$\mathbf{P}[D(G', G'') > k] \le \mathbf{P}[G' \text{ and } G'' \text{ are not distinguished by } \Phi] = p_n(\Phi)^2 + (1 - p_n(\Phi))^2$.

Combining the two bounds, we obtain

$2 p_n(\Phi)(1 - p_n(\Phi)) \le 1 - p_n(E_k)^2$.

Using Claim B, we immediately infer from here that

$|p_n(\Phi) - p(\Phi)| \le 1 - p_n(E_k)^2 \le 2(1 - p_n(E_k))$.

The desired bound (18) follows now from (16). $\triangleleft$ □
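
For convenience, the algebra behind the last steps of the proof of Claim A can be spelled out as follows. This is a routine expansion under the normalization $p(\Phi) = 0$ used in Claim B (the case $p(\Phi) = 1$ is symmetric, with $\Phi$ replaced by its negation); the abbreviations x and y are ours.

```latex
\begin{align*}
x &:= p_n(\Phi), \qquad y := p_n(E_k), \\
y^2 &\le \mathbf{P}[D(G',G'') > k] \le x^2 + (1-x)^2 = 1 - 2x(1-x), \\
2x(1-x) &\le 1 - y^2, \\
x &\le 2x(1-x) \quad \text{for } 0 \le x < 1/2 \ (\text{since } 1 - x > 1/2), \\
|p_n(\Phi) - p(\Phi)| = x &\le 1 - y^2 = (1-y)(1+y) \le 2(1-y) \le 2\alpha(n).
\end{align*}
```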

6.3. The evolution of a random graph. We now take a dynamical view on a
random graph $G_{n,p}$ by letting the edge probability p vary. With p varying from
0 to 1, $G_{n,p}$ evolves from empty to complete. We want to trace the changes of its
logical complexity during the evolution. Since the definability parameters do not
change when we pass to the complement of a graph, we can restrict ourselves to the case
$p \le 1/2$.

When p is a constant, one can estimate D(G) within additive error $O(\log\log n)$.

Theorem 6.12 (Kim et al. [50]). If $0 < p \le 1/2$ is constant, then whp

$\log_{1/p} n - c_1 \ln\ln n - O(1) \le W(G_{n,p}) \le D(G_{n,p}) \le \log_{1/p} n + c_2 \ln\ln n$,

where $c_1 = 2 \ln^{-1}(1/p)$ and $c_2 = (2 + o(1)) (-p \ln p - (1-p) \ln(1-p))^{-1} - c_1$.

Sketch of Proof. Similarly to Theorem 6.4, the lower bound is based on the
k-extension property. However, the proof of the upper bound is quite different. In
particular, we have hardly any control on the alternation number in this result. The 
argument is rather complicated so we give only a brief sketch, concentrating more 
on its logical rather than probabilistic component. 

Let $G = G_{n,p}$ be typical and $G' \not\cong G$ be arbitrary. Let V = V(G) and V' = V(G').
For a sequence X of vertices, let $V_X = \{y \in V : \forall x \in X \ xy \in E(G)\}$ and
$G_X = G[V_X]$. Let the analogous notation (with primes) apply to G'. If there is
$x \in V$ such that for every $x' \in V'$ we have $G_x \not\cong G'_{x'}$, then Spoiler selects x.
Whatever Duplicator's reply $x' \in V'$ is, Spoiler reduces the game to non-isomorphic
graphs $G_x$ and $G'_{x'}$. We expect that $|V_x| = (p + o(1))n$ and $G_x$ is also 'typical'. Thus
Spoiler used one move to reduce the order of the random graph by a factor of p,
which should lead to the upper bound $D(G) \le (1 + o(1)) \log_{1/p} n$.

Suppose now that there are $x \in V$ and distinct $y', z' \in V'$ such that $G_x \cong
G'_{y'} \cong G'_{z'}$. Spoiler selects $y' \in V'$. Assume that Duplicator replies with y = x,
for otherwise $G_y \not\cong G'_{y'}$ and Spoiler proceeds as above. Now Spoiler selects z'; let
$z \in V$ be the Duplicator's reply. We can assume that $G_{y,z} \cong G'_{y',z'}$, for otherwise
Spoiler applies the inductive strategy to the $(G_{y,z}, G'_{y',z'})$-game, where the order of
the random graph is reduced by factor $(1 + o(1))p^2$. Let $U = V_{y,z}$ and $U' = V'_{y',z'}$. A
first moment calculation shows that there is a vertex $v \in V_y \setminus U$ such that no vertex of
$V_z \setminus U$ has the same neighborhood in U as v. Let Spoiler select v and let $v' \in V'_{y'} \setminus U'$
be the Duplicator's reply. Two copies $G'_{y'}$ and $G'_{z'}$ of a 'typical' graph $G_x$ have a
large vertex intersection. Another first moment calculation shows that whp there is
only one way to achieve this, namely that the (unique) isomorphism $f : V'_{y'} \to V'_{z'}$
between $G'_{y'}$ and $G'_{z'}$ is in fact the identity on U'. But then f(v') has the same
adjacencies to U' as v'. Spoiler selects f(v') and wins the game in at most one extra
move.

Finally, up to a symmetry it remains to consider the case that there is a bijection
$g : V \to V'$ such that for any $x \in V$ we have $G_x \cong G'_{g(x)}$.

As $G \not\cong G'$, there are $y, z \in V$ such that g does not preserve the adjacency
between y and z. Spoiler selects y. We can assume that Duplicator replies with
y' = g(y), for otherwise Spoiler reduces the game to $G_y$. Now, Spoiler selects z, to
which Duplicator is forced to reply with $z' \ne g(z)$. Let $w = g^{-1}(z')$. Assume that
$G_{y,z} \cong G'_{y',z'}$, for otherwise Spoiler applies the inductive strategy to these graphs.
But then $G_{y,z}$ is an induced subgraph of $G_w \cong G'_{z'}$, a property that we do not expect
to see in a random graph.

In order to convert this rough idea into a rigorous proof one has to show that
whp as long as the subgraphs $G_{x_1, x_2, \ldots}$ that can appear in the game are sufficiently
large, they have all required properties. Also, one has to design Spoiler's strategy
to deal with small subgraphs of $G_{n,p}$ at the end of the game. All details can be found
in [50, Section 3]. □

It is interesting to investigate the behavior, e.g., of $D(G_{n,p})$ when p = p(n) tends
to zero. In particular, it is open whether, for every constant $\delta \in (0, 1)$ and $n^{-\delta} \le
p(n) \le 1/2$ we have whp $D(G_{n,p}) = O(\log n)$.

Some restriction on p(n) from below is necessary here. Indeed, let G be an arbitrary
non-empty graph (i.e., G has at least one edge) and let G' be obtained from
G by adding one more isolated vertex. It is easy to see that $W(G, G') > d_0(G)$ and
$D(G, G') > d_0(G) + 1$, where $d_0(G)$ denotes the number of isolated vertices of G. It
follows that

$W(G) \ge d_0(G) + 1$ and $D(G) \ge d_0(G) + 2$. (20)

It is well known (see, e.g., [13]) that

$d_0(G_{n,p}) = (e^{-pn} + o(1))\, n$ (21)

whp as long as $p = O(n^{-1})$. In particular, we have $W(G_{n,p}) = (1 - o(1))n$ whenever
$p = o(n^{-1})$.
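
The inequality $D(G, G') > d_0(G) + 1$ behind (20) can be checked on a toy instance by brute force. The sketch below is our own illustration, not part of the cited proofs: it searches the Ehrenfeucht game exhaustively, assuming graphs are given as dictionaries mapping each vertex to its set of neighbors. The helper names (`is_partial_iso`, `duplicator_wins`, `distinguishing_depth`) are hypothetical, and the search is only feasible for very small graphs.

```python
def is_partial_iso(G, H, xs, ys):
    """True if the map xs[i] -> ys[i] preserves equality and adjacency."""
    for i in range(len(xs)):
        for j in range(i):
            if (xs[i] == xs[j]) != (ys[i] == ys[j]):
                return False
            if (xs[j] in G[xs[i]]) != (ys[j] in H[ys[i]]):
                return False
    return True

def duplicator_wins(G, H, rounds, xs=(), ys=()):
    """Exhaustive search of the Ehrenfeucht game with `rounds` moves left."""
    if not is_partial_iso(G, H, xs, ys):
        return False
    if rounds == 0:
        return True
    # Spoiler moves in G; Duplicator needs a surviving answer in H.
    for x in G:
        if not any(duplicator_wins(G, H, rounds - 1, xs + (x,), ys + (y,)) for y in H):
            return False
    # Spoiler moves in H; Duplicator needs a surviving answer in G.
    for y in H:
        if not any(duplicator_wins(G, H, rounds - 1, xs + (x,), ys + (y,)) for x in G):
            return False
    return True

def distinguishing_depth(G, H, max_rounds=6):
    """Least number of rounds in which Spoiler wins (None if above max_rounds)."""
    for k in range(1, max_rounds + 1):
        if not duplicator_wins(G, H, k):
            return k
    return None

# G: one edge plus one isolated vertex, so d_0(G) = 1; G2 has one more isolated vertex.
G = {0: {1}, 1: {0}, 2: set()}
G2 = {0: {1}, 1: {0}, 2: set(), 3: set()}
print(distinguishing_depth(G, G2))  # 3 = d_0(G) + 2
```

In this example Spoiler needs exactly $d_0(G) + 2 = 3$ rounds: he pebbles the two isolated vertices of the larger graph and then a neighbor of Duplicator's non-isolated reply.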

In some cases, the lower bounds (20) are sharp.

Lemma 6.13. Let $c_F(G)$ denote the number of connected components in a graph G
isomorphic to a graph F. Suppose that a non-empty graph G satisfies

$c_F(G) + v(F) \le d_0(G) + 1$, for every component F of G. (22)

Then $W(G) + 1 = D(G) = D_1(G) = d_0(G) + 2$.

Proof. Let us show that $D_1(G, H) \le d_0(G) + 2$ for any $H \not\cong G$. Let F be such
that $c_F(H) \ne c_F(G)$. For definiteness suppose that $c_F(H) > c_F(G)$. Spoiler marks
$c_F(G) + 1$ components of H which are isomorphic to F by pebbling one vertex in
each of them. Duplicator is forced either to mark one of the F-components of G
twice (by pebbling two vertices, say, u and v in it) or to mark a component F' of
G which is not isomorphic to F. In the former case Spoiler wins by pebbling a
path from u to v. In the latter case Spoiler pebbles completely the F-component
of H corresponding to F'. Duplicator is forced to pebble a connected part F'' of
F'. If she has not lost yet, then $F'' \cong F$ and hence F'' is a proper subgraph of F'.
Spoiler wins by pebbling another vertex in F' which is adjacent to a vertex in F''.
Altogether at most $d_0(G) + 2$ moves are made.

It remains to prove the upper bound on the width. The last move may require
using the $(d_0(G) + 2)$-th pebble. However, for this purpose Spoiler can reuse a pebble
placed earlier in a component different from F'. This trick is unavailable only if
$c_F(G) = 1$ and $c_F(H) = 0$ or if $c_F(G) = 0$ and $c_F(H) = 1$. In both cases Spoiler
can win in at most v(F) + 1 rounds (and at most one alternation). Moreover, if this
number is at least $d_0(G) + 2$, then $c_F(G) = 0$ and G has no component with v(F)
or more vertices by (22). In this case, Spoiler can win in at most v(F) moves. □
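
Condition (22) is purely combinatorial and can be tested mechanically on small graphs. The following sketch is our own illustration: it assumes a graph is given as a dictionary mapping each vertex to its set of neighbors, and it classifies components up to isomorphism by brute force over all relabelings, which is practical only for components with a handful of vertices. The function names are hypothetical.

```python
from itertools import permutations
from collections import Counter

def components(G):
    """Connected components of a graph given as {vertex: set_of_neighbors}."""
    seen, comps = set(), []
    for s in G:
        if s in seen:
            continue
        comp, stack = [], [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in G[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def canonical_form(G, comp):
    """Isomorphism invariant: (order, least sorted edge list over all relabelings)."""
    best = None
    for perm in permutations(range(len(comp))):
        label = dict(zip(comp, perm))
        edges = tuple(sorted((label[u], label[w])
                             for u in comp for w in G[u] if label[u] < label[w]))
        if best is None or edges < best:
            best = edges
    return (len(comp), best)

def satisfies_condition_22(G):
    """Check c_F(G) + v(F) <= d_0(G) + 1 for every component type F of G."""
    comps = components(G)
    d0 = sum(1 for c in comps if len(c) == 1)          # isolated vertices
    counts = Counter(canonical_form(G, c) for c in comps)
    return all(count + size <= d0 + 1 for (size, _), count in counts.items())

# One edge plus three isolated vertices satisfies (22); one edge plus one does not.
ok = {0: {1}, 1: {0}, 2: set(), 3: set(), 4: set()}
bad = {0: {1}, 1: {0}, 2: set()}
print(satisfies_condition_22(ok), satisfies_condition_22(bad))
```

For the first graph the lemma then gives $D(G) = d_0(G) + 2 = 5$; for the second graph the lemma simply does not apply.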

Theorem 6.14 (Kim et al. [50], Bohman et al. [12]). If p = c/n with c = c(n) > 0
being an arbitrary bounded function of n, then $D(G_{n,p}) = (e^{-c} + o(1))\, n$ whp.
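
The constant $e^{-c}$ is the limiting fraction of isolated vertices. As a quick sanity check (our own, and covering only the expectation, not the concentration that "whp" requires), a fixed vertex of $G_{n,p}$ is isolated with probability $(1-p)^{n-1}$, which for p = c/n is close to $e^{-c}$. The function name below is hypothetical.

```python
import math

def expected_isolated_fraction(n, c):
    """E[d_0(G_{n,p})] / n for p = c/n: a vertex is isolated with probability (1-p)^(n-1)."""
    p = c / n
    return (1 - p) ** (n - 1)

for c in (0.5, 1.0, 2.0):
    # Compare the exact expectation for n = 10^6 with the limit e^{-c}.
    print(c, expected_isolated_fraction(10**6, c), math.exp(-c))
```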

Sketch of Proof. It is well known that, observing the evolution process in the scale
p = c/n, at the point c = 1 we encounter the phase transition. If $c < 1 - \epsilon$, whp all
components of $G_{n,p}$ have $O(\log n)$ vertices each; if $c > 1 + \epsilon$, there appears a unique
exception, the so-called giant component with a linear number of vertices.

One can check that, for any $c \le \alpha - \epsilon$, Condition (22) holds whp (even for
the giant component if it exists), where $\alpha = 1.1918\ldots$ is a root of some explicit
equation, see [50, Theorem 19]. Then, by Lemma 6.13 and Equality (21), $W(G_{n,p})$
and $D_1(G_{n,p})$ (and all parameters in between) are $(e^{-c} + o(1))\, n$. When c is larger
than $\alpha + \epsilon$, then whp the giant component of $G_{n,p}$ violates (22): Its order exceeds
the number of isolated vertices. This case is handled in [12] as follows.

Denote the giant component of $G = G_{n,p}$ by M. Given $H \not\cong G$, we have to
design a strategy allowing Spoiler to win the Ehrenfeucht game on G
and H fast enough. The strategy in the proof of Lemma 6.13 does not work only if $c_M(H) = 0$
or $c_M(H) \ge 2$. We adapt it for these cases so that Spoiler, instead of selecting all
vertices of M, plays an optimal strategy for M using at most $D(M) + \log n + 1$
moves (instead of v(M) + 1 moves as earlier).

First, we can assume that no component of H has diameter n or more. Otherwise
Spoiler pebbles u and v at distance n in H. For Duplicator's responses u' and v'
in G we have either dist(u', v') < n or dist(u', v') = $\infty$. Hence Spoiler wins in less
than $\log n + 1$ moves.

Second, we can assume that Duplicator always respects the connectivity relation 
(two vertices are in the relation if they are connectable by a path). Indeed, suppose 
that u and v belong to the same connected component F in one of the graphs while
their counterparts u' and v' are in different components of the other graph. Then
Spoiler wins in less than $\log \mathrm{diam}(F) + 1$ moves.

Under this assumption, Spoiler easily forces that, starting from the 2nd round,
the play goes on components of G and H, of which exactly one is isomorphic to M.
One of the main results of [12] states that whp

$D(M) = O\left(\frac{\log n}{\log\log n}\right)$, (23)

which implies that Spoiler is able to win quickly and proves the theorem.

The upper bound ( 1231) is obtained roughly as follows. By iteratively removing 
vertices of degree 1 from the giant component M, one obtains the core C of M 
(that is, C is a maximum subgraph with minimum degree at least 2). The kernel 
K of G is the serial reduction of C, that is, we iterate the following to obtain K: If 
there is a vertex x of degree 2, then we remove x but add edge {y, z}, where y and z 
are the two neighbors of x. The kernel may have loops and multiple edges and has to 
be modeled as a colored graph. The original graph G can be encoded by specifying 
its kernel K and the structure of rooted trees that correspond to each vertex or edge 
of K, the latter being viewed as a total coloring of K. It happens that whp every 
vertex x of K can be identified by a small-depth formula $\Phi_x$ with one free variable
(that is, $K, x \models \Phi_x$ while $K, y \not\models \Phi_x$ for every other vertex $y \in V(K)$) in the
first-order language of colored graphs. Thus one can define K succinctly by stating that
for every $x \in V(K)$ there is a unique vertex satisfying $\Phi_x$, that every vertex satisfies
$\Phi_x$ for some $x \in V(K)$, and by listing the adjacencies between vertices identified by
$\Phi_x$ and $\Phi_y$ for every $x, y \in V(K)$. The core C can now be defined by specifying the
length of the path corresponding to each edge of K, while the giant component M
can be defined by specifying the random rooted trees hanging on the vertices of C
using Theorem 6.10 (which relies in part on Theorem 5.4). □
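
The two reduction steps used in this sketch (peeling leaves to get the core, then suppressing vertices of degree 2 to get the kernel) are simple enough to write down. The code below is a minimal illustration of these steps only, under our own conventions: a connected multigraph is a list of edges (pairs of vertices), a loop contributes 2 to the degree, and the colorings that record the removed trees and path lengths are not modeled. The function names are hypothetical.

```python
from collections import Counter

def core(edges):
    """Iteratively remove vertices of degree 1; returns the remaining edge list."""
    edges = list(edges)
    while True:
        deg = Counter()
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        leaves = {x for x, d in deg.items() if d == 1}
        if not leaves:
            return edges
        edges = [(u, v) for u, v in edges if u not in leaves and v not in leaves]

def kernel(core_edges):
    """Serial reduction: replace each degree-2 vertex x by an edge joining its neighbors."""
    edges = list(core_edges)
    while True:
        incident = {}
        for i, (u, v) in enumerate(edges):
            incident.setdefault(u, []).append(i)
            if v != u:
                incident.setdefault(v, []).append(i)
        # A vertex has degree 2 here iff it lies on exactly two edges, neither a loop.
        x = next((w for w, idx in incident.items()
                  if len(idx) == 2 and all(edges[i][0] != edges[i][1] for i in idx)), None)
        if x is None:
            return sorted(tuple(sorted(e)) for e in edges)
        i, j = incident[x]
        y = edges[i][0] if edges[i][1] == x else edges[i][1]
        z = edges[j][0] if edges[j][1] == x else edges[j][1]
        edges = [e for k, e in enumerate(edges) if k not in (i, j)] + [(y, z)]

# Two triangles joined by the path 2-3-4-5, plus a pendant leaf 8 attached to vertex 0.
M = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 5), (0, 8)]
print(kernel(core(M)))  # [(2, 2), (2, 5), (5, 5)]
```

The example shows why the kernel has to be treated as a multigraph: each triangle collapses to a loop at its branching vertex.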

The bound (23) is optimal up to a constant factor. This follows from the fact that
whp the giant component M has a vertex v adjacent to at least $(1 - \epsilon) \log n / \log\log n$
leaves. (Indeed, consider the graph $M' \not\cong M$ that is obtained from M by attaching
an extra leaf at v.) We believe that the lower bound is sharp, that is, whp $D(M) =
(1 + o(1)) \log n / \log\log n$, but we were not able to settle this question.

Finally, we consider edge probabilities $p = n^{-\alpha}$ with rational $\alpha \in (0, 1)$. Such p
occur as threshold functions for (non-)appearance of particular graphs as induced
subgraphs in $G_{n,p}$. What is relevant to our subject is that such p show an irregular
behavior of $G_{n,p}$ with respect to first-order properties.

Since the treatment of the general case of rational $\alpha$ would require a considerable
amount of technical work, the paper [50] focuses on a sample value $\alpha = 1/4$, when
$D(G_{n,p})$ falls down and becomes essentially as small as possible (cf. Section 7).

Theorem 6.15 (Kim et al. [50]). If $p = n^{-1/4}$, then whp

$\log^* n - \log^*\log^* n - 1 \le D(G_{n,p}) \le D_3(G_{n,p}) \le \log^* n + O(1)$.

Sketch of Proof. The upper bound is based on the following ideas. Let the predicate
$C(x_1, x_2, x_3, x_4)$ state that these 4 distinct vertices have no common neighbor. Its
probability is $(1 - p^4)^{n-4} = e^{-1} + o(1)$ and its values over different 4-tuples are
rather weakly correlated. Thus, if for a set A and a vertex $v \notin A$, we define
$H_v(A)$ to be the 3-uniform hypergraph on A with $x_1, x_2, x_3 \in A$ being a hyperedge
if and only if $C(v, x_1, x_2, x_3)$ holds, then $H_v(A)$ behaves somewhat like a random
hypergraph. As it is shown in [50, Lemma 21], one can find 4 vertices such that their
common neighborhood A is relatively large (namely, $|A| = \lfloor \ln^{0.3} n \rfloor$) and yet there
are vertices a, m such that hypergraphs $H_a(A)$ and $H_m(A)$ encode in some way the
multiplication and addition tables for an initial interval of integers. Also, any integer
can be succinctly defined in first-order logic with arithmetic operations. Roughly
speaking, in order to define an integer j, one can write it in binary $j = b_k \ldots b_1$ and
specify for every $i \le k$ the i-th bit $b_i$; crucially, the same binary expansion trick
can be used recursively to specify the index i, and so on. This allows us to identify
vertices of A with very small depth. Next, we consider the set B of vertices of G that
have exactly 4 neighbors in A and are uniquely determined by this. Again, the
vertices of B are easy to identify (just list the 4 neighbors in A). Finally, if A was
chosen carefully, then each vertex w of G is uniquely identified by the hypergraph
$H_w(B)$. (The reason that we need an intermediate set B is that the number of
possible 3-uniform hypergraphs $H_w(A)$ is at most $2^{\binom{|A|}{3}} < n - |A|$, that is, too
small.) Of course, many technical difficulties arise when one tries to realize this
approach.

The lower bound in Theorem 6.15 is very general. We use only the simple fact that
any particular unlabeled graph with m edges is the value of $G_{n,p}$, where $p = n^{-1/4}$,
with probability at most

$n!\, p^m (1-p)^{\binom{n}{2} - m} \le n!\, (1-p)^{\binom{n}{2}} \le \exp(-(1/2 - o(1))\, n^{7/4})$.

Let F(k) be the number of non-isomorphic graphs definable with depth at most
k. Then $\mathbf{P}[D(G) \le k] \le F(k) \exp\{-(1/2 - o(1))\, n^{7/4}\}$. By Theorem 2.3, $F(k) \le
\mathrm{Tower}(k + 2 + \log^* k)$. If $k = \log^* n - \log^*\log^* n - 2$, we have $F(k) \le 2^n$ and hence
$\mathbf{P}[D(G) \le k] = o(1)$. □

The above idea (arithmetization of certain vertex sets in graphs) has been previously
used by Spencer [72, Section 8] to obtain non-convergence and non-separability
results on the example of $G_{n,p}$ with $p = n^{-1/3}$.

So far we have considered the evolution of the logical complexity of a random 
graph in the standard logic with no counting. We conclude this section with an
extension of Theorem 6.1.

Theorem 6.16 (Czajka and Pandurangan [Hj). Let p(n) be any function of n such 
that $\omega(n) \frac{\ln^4 n}{n \ln\ln n} \le p(n) \le 1/2$ where $\omega(n) \to \infty$ as $n \to \infty$. Then 2 color refinements
split a random graph $G_{n,p}$ into color classes which are singletons with probability that
is higher than $1 - n^{-c}$ for each constant c > 0 and all large enough n. Consequently,
$D_\#(G_{n,p}) \le 4$ with this probability.

Note that, in the case of p = 1/2, this result improves the probability bound in
Part 1 of Theorem 6.1 while the probability bound in Part 2 is still better.
By elaborating on the argument of Lemma 6.2 we are able to supply Theorem
6.16 with the matching lower bound (that is, $D_\#(G_{n,p}) \ge 4$ whp) within the range

$4\sqrt{\log n / n} \le p \le 1/2$.

7. Best-case bounds: Succinct definitions 

As in the preceding sections, we consider the logical depth of graphs with a given 
number of vertices n. We know that the maximum value D(G) = n + 1 is attained 
by the complete and empty graphs (and only by them) and that the typical values 
lie around $\log n$ (see Theorem 6.4). Now we are going to look at the minimum. We
already have a good starting point: By Theorem 6.15, there are graphs with

$D_3(G) \le \log^* n + O(1)$.

In order to get such examples, we have to generate a random graph with the edge
probability $n^{-1/4}$. In Section 7.1 we give three explicit constructions achieving
the same bound. In Section 7.2 we introduce the succinctness function $q(n) =
\min \{D(G) : v(G) = n\}$ and give an account of what is known about it. Section 7.3
is devoted to the question of how succinctly we can define graphs if we are not
allowed to make quantifier alternations. In Section 7.4 the bounds on the succinctness
function are applied to proving separation results for logical parameters of graphs,
in particular, for D(G) and L(G).

7.1. Three constructions. 

7.1.1. First method: Padding. We describe a "padding" operation that was invented
by Joel Spencer (unpublished). It converts any graph G to an exponentially larger
graph G* with the logical depth larger just by 1. G* includes G as an induced
subgraph. In addition, for every subset X of V = V(G), the graph G* contains a
vertex $v_X$. Denote the set of these vertices by $V^*$. There is no edge inside $V^*$ but
there are some edges between V and $V^*$. Specifically, $v \in V$ is adjacent to $v_X$ iff
$v \in X$. In particular, $v_\emptyset$ is isolated and $N(v_V) = V$.

Vertex $v_V$ will play a special role in our first-order definition of G*. First of all,
we will say that there is a vertex c (assuming $c = v_V$) whose neighborhood spans
in G* a subgraph isomorphic to G. This can be done by relativizing a formula $\Phi_G$
defining G to N(c). That is, each universal quantification $\forall x(\Psi)$ in $\Phi_G$ has to be
modified to

$\forall_{x \in N(c)}(\Psi) = \forall x\, (x \sim c \to \Psi)$

and each existential quantification to

$\exists_{x \in N(c)}(\Psi) = \exists x\, (x \sim c \wedge \Psi)$.
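
Relativization is a purely syntactic rewriting, which the following sketch makes concrete. It is our own illustration, assuming formulas are represented as nested tuples such as `("forall", "x", body)`, `("and", a, b)`, `("adj", "x", "y")` and `("eq", "x", "y")`; this encoding and the function names are hypothetical. It also assumes that the variable c is not bound anywhere in the input formula.

```python
def relativize(phi, c):
    """Restrict every quantifier in phi to the neighborhood N(c) of the vertex variable c."""
    op = phi[0]
    if op == "forall":
        _, x, body = phi
        return ("forall", x, ("implies", ("adj", x, c), relativize(body, c)))
    if op == "exists":
        _, x, body = phi
        return ("exists", x, ("and", ("adj", x, c), relativize(body, c)))
    if op in ("and", "or", "implies"):
        return (op, relativize(phi[1], c), relativize(phi[2], c))
    if op == "not":
        return ("not", relativize(phi[1], c))
    return phi  # atomic formula: ("adj", x, y) or ("eq", x, y)

def depth(phi):
    """Quantifier depth: the maximum number of nested quantifiers."""
    op = phi[0]
    if op in ("forall", "exists"):
        return 1 + depth(phi[2])
    if op in ("and", "or", "implies"):
        return max(depth(phi[1]), depth(phi[2]))
    if op == "not":
        return depth(phi[1])
    return 0

# exists x forall y (x = y): a sentence defining the single-vertex graph.
phi = ("exists", "x", ("forall", "y", ("eq", "x", "y")))
rel = relativize(phi, "c")
print(rel)
print(depth(phi), depth(rel))  # relativization adds connectives but no quantifiers
```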

Denote the relativized version of $\Phi_G$ by $\Phi_G|_{N(c)}$. Note that relativization does not
change the quantifier depth. A sentence defining G* can now look as follows:

$\Phi_{G^*} \stackrel{\mathrm{def}}{=} \exists c\, \big( \Phi_G|_{N(c)} \wedge \forall_{x \notin N(c)} (N(x) \subseteq N(c)) \wedge \forall_{x_1 \notin N(c)} \forall_{x_2 \notin N(c)} (N(x_1) \ne N(x_2))$
$\qquad \wedge\ \forall_{x_1 \notin N(c)} \forall_{y \in N(x_1)} \exists_{x_2 \notin N(c)} (N(x_2) = N(x_1) \setminus \{y\}) \big)$,

where we use harmless shorthands for simple first-order expressions.
It is easy to see that

$D(\Phi_{G^*}) = \max \{D(\Phi_G), 4\} + 1$

and that, if $\Phi_G$ is a $\exists^*\forall^*\exists^*\forall^*$-formula (that is, every chain of nested quantifiers is a
string of this form), then $\Phi_{G^*}$ stays in this class as well. Consider now a sequence
of graphs $G_k$ where $G_1 = P_1$, the single-vertex graph, and $G_{k+1} = (G_k)^*$. Since
$v(G^*) = v(G) + 2^{v(G)}$, we have $v(G_k) \ge \mathrm{Tower}(k - 1)$. It follows that $D_3(G_k) \le
\log^* v(G_k) + 3$.
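
The padding operation itself is easy to carry out for the first few steps. Below is a minimal sketch of our own, assuming a graph is a dictionary mapping each vertex to its set of neighbors; the tagging of vertices as `("v", ...)` and `("X", ...)` and the function name `pad` are hypothetical conventions chosen to keep old and new vertices apart.

```python
from itertools import combinations

def pad(G):
    """Return G* for a graph G given as {vertex: set_of_neighbors}."""
    V = list(G)
    # Copy of G on tagged vertices.
    star = {("v", v): {("v", w) for w in G[v]} for v in V}
    # One new vertex v_X per subset X of V, adjacent exactly to the members of X.
    for r in range(len(V) + 1):
        for X in combinations(V, r):
            vX = ("X", frozenset(X))
            star[vX] = {("v", v) for v in X}
            for v in X:
                star[("v", v)].add(vX)
    return star

G = {0: set()}            # G_1 = P_1, the single-vertex graph
orders = [len(G)]
for _ in range(3):        # G_2, G_3, G_4
    G = pad(G)
    orders.append(len(G))
print(orders)             # [1, 3, 11, 2059]
```

The orders follow the recurrence $v(G^*) = v(G) + 2^{v(G)}$, so one more iteration would already produce a graph with more than $2^{2059}$ vertices.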

7.1.2. Second method: Unite and conquer. Suppose that we have a set C of n-vertex
graphs, each of logical depth at most d. Our goal is to construct a much larger set
C* of graphs with a much larger number of vertices n* and logical depth bounded by
d + 3. An additional technical condition is that all the graphs have diameter 2. We
know from Theorem 6.4 that almost all graphs on n vertices have logical depth less
than $\log n$ and it is well known that they have diameter 2. Choosing a sufficiently
large n, we can start with C being the class of all such graphs. Since almost all graphs
are asymmetric, we have $|C| = (1 - o(1))\, 2^{\binom{n}{2}}$. Just for the notational simplicity, we
prefer that |C| is even.

For each $S \subset C$ such that $|S| = |C|/2$, the set C* contains the graph

$G_S = \overline{\bigsqcup_{G \in S} G}$,

that is, we take the vertex disjoint union of all graphs in S and complement it. For
convenience, we bound the logical depth of the complement $\overline{G_S} = \bigsqcup_{G \in S} G$ rather
than that of $G_S$. Given an arbitrary $H \not\cong \overline{G_S}$, we analyze the Ehrenfeucht game on
the two graphs.

If H has a connected component of diameter at least 3, Spoiler pebbles vertices u
and v in H at the distance exactly 3 from one another. For Duplicator's responses
u' and v' in $\overline{G_S}$, either dist(u', v') $\le$ 2 or dist(u', v') = $\infty$. In any case, Spoiler
wins within the next 2 moves. Suppose from now on that all components of H have
diameter at most 2. This condition allows us to assume that Duplicator respects
the connectivity relation for otherwise Spoiler wins with one extra move (which will
be added to the total count of rounds).

If one of the graphs, $\overline{G_S}$ or H, has a connected component A non-isomorphic to
any component of the other graph, Spoiler pebbles a vertex in A. Let B be the
component of the other graph where Duplicator responds. Starting from the second
round, Spoiler plays the Ehrenfeucht game on non-isomorphic graphs A and B and
wins in at most d moves.

If such a component does not exist, $\overline{G_S}$ must have a component A with at least
two isomorphic copies in H. Then in the first two rounds Spoiler pebbles vertices
in these two. Duplicator is forced at least once to respond in a component B of $\overline{G_S}$
non-isomorphic to A, which is an already familiar configuration.

Thus, $D(G_S)$ can be at most 3 larger than the maximum logical depth of graphs
in C. At the same time $G_S$ has a much larger number of vertices, namely $n^* =
n|C|/2$. It follows that $D(G) < \log\log v(G)$ for any G in C*.
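
The last inequality can be checked by a rough estimate. The following is a sketch for the first step of the construction, assuming base-2 logarithms, $d < \log n$ for the initial class, and n large. It uses only the crude bound $|C| \ge 2^{n^2/4}$, which holds for large n whether or not the graphs in C are counted up to isomorphism, since $n! = 2^{O(n \log n)}$.

```latex
\begin{align*}
v(G) = n^* &= \tfrac{1}{2}\, n\, |C| \ \ge\ |C| \ \ge\ 2^{n^2/4}
  && \text{(for large } n\text{)}, \\
\log\log v(G) &\ge \log\frac{n^2}{4} = 2\log n - 2, \\
D(G) &\le d + 3 < \log n + 3 \le 2\log n - 2 \le \log\log v(G)
  && \text{(once } \log n \ge 5\text{)}.
\end{align*}
```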

Note that any graph in C* is the complement of a disconnected graph and hence
has diameter 2. This allows us to iterate the construction. Say, for any $G \in (C^*)^*$
we get $D(G) < \log\log\log v(G)$ and so on. If we fix the initial class C, the iteration
procedure gives us graphs with $D(G) \le 3 \log^* v(G) + O(1)$. This bound is worse
than in the preceding section but the extra factor of 3 can be eliminated if Spoiler
plays more smartly (see [62]).

7.1.3. Third method: Asymmetric trees. The two previous examples were artificially 
constructed with the aim to ensure low quantifier depth. Now we present a natural 
class of graphs admitting succinct definability. 

The radius of a graph G is defined by $r(G) = \min_{v \in V(G)} e(v)$, where e(v) denotes
the eccentricity of a vertex v. A vertex v is central if e(v) = r(G). Any tree has
either one or two central vertices (see, e.g., [59, Chapter 4.2]).

Lemma 7.1. Let T be an asymmetric tree with $r(T) \ge 6$. Then $D(T) \le r(T) + 2$.

Proof. We will design a strategy for Spoiler in the Ehrenfeucht game on T and a
non-isomorphic graph T'. The reader who took the effort to reconstruct the proof
of Theorem 5.3 will now definitely benefit.

We can assume that T and T' have equal diameters (in particular, T' is connected)
for else Spoiler wins in less than $\log r(T) + 4$ moves by Lemma 3.2. If T' is a non-tree,
let Spoiler pebble a vertex v' on a cycle in T'. By this move Spoiler forces the game on
$T \setminus v$ and $T' \setminus v'$, where v is Duplicator's response in T. If v is a leaf, Spoiler wins in two
moves. Otherwise $T \setminus v$ is disconnected, while $\mathrm{diam}(T' \setminus v') \le 3\, \mathrm{diam}(T') \le 6 r(T)$.
Lemma 3.2 applies again and Spoiler wins in less than $\log r(T) + 6$ moves. Assume,
therefore, that T' is a tree too.

Call a tree diverging if every vertex w splits it into pairwise non-isomorphic 
branches, where each branch is considered rooted at the respective neighbor of w 
(an isomorphism of rooted trees has to match their roots). Any asymmetric tree is 
obviously diverging. On the other hand, if a tree is diverging, it is either asymmet- 
ric or has a single nontrivial automorphism and the latter transposes two central 
vertices. 

Suppose that T' is diverging. In the first round Spoiler pebbles a central vertex v
of T and Duplicator responds with a vertex v' in T'. As it is easily seen, at least one
of $T \setminus v$ and $T' \setminus v'$ has a branch B non-isomorphic to any branch in the other tree.
Spoiler restricts further play to B by pebbling its root. Continuing in this fashion,
that is, each time finding a matchless subbranch, Spoiler forces pebbling two paths
in T and T' emanating from v and v' respectively. Spoiler wins at latest when the
path in T reaches a leaf.

So suppose that T' is not diverging. Let v' be a central vertex of T' and u' be
a vertex at the maximum possible distance from v' with the property that $T' \setminus u'$
has two isomorphic branches B' and B''. Spoiler pebbles the path from v' to u' and
the two neighbors of u' in B' and B''. From this point Spoiler can play as before
because B' and B'' are diverging and only one of them can be isomorphic to the
corresponding branch pebbled by Duplicator in T. □

Lemma 7.1 shows that asymmetric trees are definable with quantifier depth not
much larger than their radius. On the other hand, asymmetric trees can grow in
breadth, having a huge number of vertices. More precisely, there are asymmetric
trees with $v(T) \ge \mathrm{Tower}(r(T) - 1)$. Indeed, let $r_k$ denote the number of asymmetric
rooted trees of height at most k. A simple recurrence

$r_0 = 1, \quad r_k = 2^{r_{k-1}}$

shows that $r_k = \mathrm{Tower}(k)$. Let $k \ge 3$ and $T_k$ be the (unrooted) tree of radius k with
a single central vertex c such that the set of branches growing from c consists of all
$r_{k-1}$ pairwise non-isomorphic asymmetric rooted trees of height less than k. (The
reader will now surely recognize another instance of the unite-and-conquer method!)
Since $T_k$ has even diameter, the central vertex c is fixed under all automorphisms. It
easily follows that $T_k$ is asymmetric. This graph will be referred to as the universal
asymmetric tree of radius k. Note that $v(T_k) \ge r_{k-1} + 1 > \mathrm{Tower}(k - 1)$. Combining
it with Lemma 7.1, we obtain $D(T_k) \le k + 2 \le \log^* v(T_k) + 2$.
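
The recurrence can be extended to count vertices as well, which gives the exact order of the first few universal asymmetric trees. The sketch below is our own: it uses the fact behind the recurrence (an asymmetric rooted tree of height at most k is a root together with a set of pairwise distinct asymmetric rooted trees of height at most k - 1), and the quantity $s_k$, the total number of vertices over all such trees, is an auxiliary notion not used in the text.

```python
def count_asymmetric_rooted(k):
    """Return (r_k, s_k): number and total vertex count of asymmetric rooted trees of height <= k."""
    r, s = 1, 1  # height 0: only the single-vertex tree
    for _ in range(k):
        # A tree is a root plus any subset of the r smaller trees;
        # each smaller tree occurs in exactly half of the 2^r subsets.
        r, s = 2 ** r, 2 ** r + 2 ** (r - 1) * s
    return r, s

def universal_tree_order(k):
    """v(T_k): the central vertex plus all asymmetric rooted trees of height < k."""
    _, s = count_asymmetric_rooted(k - 1)
    return 1 + s

for k in (3, 4, 5):
    print(k, count_asymmetric_rooted(k - 1)[0], universal_tree_order(k))
```

For k = 3, 4, 5 this gives $r_{k-1}$ = 4, 16, 65536 and $v(T_k)$ = 11, 97, 3211265, in line with the bound $v(T_k) \ge r_{k-1} + 1$.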

With a little extra work, trees with low logical depth can be constructed on any
given number of vertices. It turns out that the log-star bound is essentially the best
that can be achieved for trees.

Theorem 7.2 (Pikhurko, Spencer, and Verbitsky [61]). For every n there is a tree
T on n vertices with $D(T) \le \log^* n + 4$. On the other hand, for all trees T on n
vertices we have $D(T) \ge \log^* n - \log^*\log^* n - 4$.

We will see in the next section that the lower bound of Theorem 7.2 cannot be
extended to the class of all graphs.

Universal asymmetric trees have long proved to be a useful technical tool in
complexity theory and finite model theory, see the references in
Dawar et al. [21]. Lemma 3.4(e) in the latter paper readily implies a succinctness
result for the logical length.

Theorem 7.3 (Dawar et al. [21]). For the universal asymmetric tree of radius k we
have $L(T_k) = O((\log^* v(T_k))^4)$.

The theorem shows that, for infinitely many n, there is a tree T on n vertices with
$L(T) = O((\log^* n)^4)$. Unlike Theorem 7.2, this result cannot be extended to all n
because there are infinitely many n such that all graphs on n vertices have logical 

length Q (i^g^) (see ([30} in the proof of Theorem EH • 

7.2. The succinctness function. Define the succinctness function by

$q(n) = \min \{D(G) : v(G) = n\}$.

Since only finitely many graphs are definable with a fixed quantifier depth (see
Theorem 2.3), we have $q(n) \to \infty$ as $n \to \infty$. The examples collected in Section
7.1 show that q(n) increases rather slowly. Let $q_a(n)$ denote the version of q(n) for
definitions with at most a quantifier alternations. The padding construction from
Section 7.1.1 gives us

$q_3(n) \le \log^* n + 3$ (24)

for infinitely many n and, by Theorem 6.15, this bound holds actually for all n,
perhaps with a worse additive constant.

Is the log-star bound best possible? The answer is surprising enough: in some 
strong sense it is but, at the same time, it is very far from being tight. First, let us 
elaborate on the latter claim. 

A prenex formula is a formula with all its quantifiers being in front. In this case
there is a single sequence of nested quantifiers and the quantifier rank is just the
number of quantifiers occurring in a formula. The superscript prenex will mean that
we allow defining sentences only in prenex form. Thus, $q_a^{\mathrm{prenex}}(n)$ is equal to the
minimum quantifier depth of a prenex formula with at most a quantifier alternations
that defines a graph on n vertices. We obviously have $D(G) \le D_a(G) \le D_a^{\mathrm{prenex}}(G)$.
Recall that $L_a(G)$ denotes the minimum length of a sentence defining G with at
most a quantifier alternations. Since a quantifier-free formula with k variables is
equivalent to a disjunctive normal form over $2\binom{k}{2}$ relations between the variables,
we obtain also the relation

$L_a(G) = O(h(D_a^{\mathrm{prenex}}(G)))$ where $h(k) = k^2 2^{k^2}$. (25)

Recall that a total recursive function is an everywhere defined recursive function. 

Theorem 7.4 (Pikhurko, Spencer, and Verbitsky [61]). There is no total recursive
function f such that $f(q_3^{\mathrm{prenex}}(n)) \ge n$ for all n.

The theorem implies a superrecursive gap between v(G) and $D_3(G)$ or even $L_3(G)$.
In particular, the values of $q_3(n)$ are infinitely often inconceivably smaller even than
the values of $\log^* n$. More generally, if a total recursive function l(n) is monotone
nondecreasing and tends to infinity, then

$q(n) < l(n)$ for infinitely many n, (26)

which actually means that the succinctness function admits no reasonable lower 
bound. 

The proof of Theorem 7.4 is based on simulation of a Turing machine M by a 
prenex formula $\Phi_M$ in which a computation of M determines a graph satisfying $\Phi_M$ 
and vice versa. Such techniques were developed in the classical research on Hilbert's 
Entscheidungsproblem by Turing, Trakhtenbrot, Büchi and other researchers (see 
[14] for survey and references). An important feature of our simulation is that it 
works if we restrict the class of structures to graphs. As a by-product, we obtain 
another proof of Lavrov's version of the Trakhtenbrot theorem [52] (see also [26, 
Theorem 3.3.3]) saying that the first-order theory of finite graphs is undecidable. 
The proof actually shows the undecidability of the $\forall^*\exists^p\forall^s\exists^t$-fragment of this theory 
for some p, s, and t. 

We now have to explain why bound (24), though not sharp, is best possible in 
some sense. Let us define the smoothed succinctness function q*(n) to be the least 
monotone nondecreasing integer function bounding q(n) from above, that is, 

$$q^*(n) = \max_{m \le n} q(m). \qquad (27)$$

The following theorem shows that $q^*(n) = (1 + o(1)) \log^* n$ and, therefore, the 
log-star function is a nearly optimal monotone upper bound for the succinctness 
function q(n). 
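
The passage from q(n) to its smoothed version in (27) is simply a running maximum. The following minimal sketch illustrates this on a short list of made-up values; the numbers are hypothetical and are not actual values of the succinctness function, and the function name `smoothed` is introduced here only for illustration.

```python
from itertools import accumulate

def smoothed(q_values):
    """Running maximum: the n-th entry is the max of q(m) over all m <= n."""
    return list(accumulate(q_values, max))

# Hypothetical values standing in for q(1), q(2), ...; chosen only to show
# how a non-monotone sequence is replaced by its least monotone upper bound.
q = [0, 3, 3, 4, 3, 5, 4, 4]
print(smoothed(q))
```

Wherever the smoothed list differs from the input, the original sequence dips below a value it attained earlier, which is the kind of deviation of q(n) from q*(n) discussed in the text.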

Theorem 7.5 (Pikhurko, Spencer, and Verbitsky [61]). 

$$\log^* n - \log^* \log^* n - 2 \le q^*(n) \le \log^* n + 4.$$

Though the lower bound contains a nonconstant lower order term, it can hardly 
be distinguished from a constant: for example, $\log^* \log^* n = 3$ for $n = 10^{80}$, which is 
a rough estimate of the number of elementary particles in the observable universe. 
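
This numerical remark can be checked directly. The sketch below assumes the convention Tower(0) = 1, Tower(i + 1) = 2^Tower(i) and takes log* n to be the least i with Tower(i) ≥ n; the function names are illustrative and not taken from the survey.

```python
def tower(i):
    """Tower(0) = 1, Tower(i + 1) = 2 ** Tower(i) (assumed convention)."""
    t = 1
    for _ in range(i):
        t = 2 ** t
    return t

def log_star(n):
    """Least i such that Tower(i) >= n (assumed definition of log*)."""
    i, t = 0, 1
    while t < n:
        t = 2 ** t
        i += 1
    return i

n = 10 ** 80
print(log_star(n))            # 5, since Tower(4) = 65536 < n <= Tower(5)
print(log_star(log_star(n)))  # 3, since Tower(2) = 4 < 5 <= Tower(3) = 16
```

Python's arbitrary-precision integers make the comparison with Tower(5) = 2^65536 exact; the loop should not be run on inputs beyond Tower(5), since the next tower value is far too large to store.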

Proof. Theorem 7.2 implies that $q(n) \le \log^* n + 4$ for all n. Since this bound 
is monotone, it is a bound on q*(n) as well. The lower bound for q*(n) can be 
derived from Theorem 2.3. According to it, at most Tower(k + log* k + 2) graphs 
are definable with quantifier depth k. Given n > Tower(3), let k be such that 
$\mathrm{Tower}(k + 2 + \log^* k) < n \le \mathrm{Tower}(k + 3 + \log^*(k + 1))$. It follows that 
$k > \log^* n - \log^* \log^* n - 4$. By the Pigeonhole Principle, there will be some $m \le n$ for 
which no graph of order precisely m is defined with quantifier depth at most k. We 
conclude that $q^*(n) \ge q(m) > k$ and hence $q^*(n) \ge \log^* n - \log^* \log^* n - 2$. □ 

We defined q*(n) to be the "closest" to q(n) monotone function. Notice that q(n) 
itself lacks the monotonicity, deviating from q*(n) infinitely often (set l(n) to be the 
lower bound in Theorem 7.5 and apply (26)). 

7.3. Definitions with no quantifier alternation. It is interesting to observe how 
the succinctness function changes when we put restrictions on the logic. Note that 
everything we have stated about the succinctness function for first-order logic actually 
holds true for its fragment with 3 quantifier alternations. Now we consider the 
first-order logic with no quantifier alternation, consisting of purely existential and 
purely universal formulas and their monotone Boolean combinations (of course, all 
negations are supposed to stay in front of relation symbols). It is easy to see that any 
sentence with no quantifier alternation is equivalent to a sentence in the 
Bernays-Schönfinkel class. The latter consists of prenex formulas in which the existential 
quantifiers all precede the universal quantifiers, as in 

$$\Phi = \exists x_1 \ldots \exists x_k \forall y_1 \ldots \forall y_l \Psi(\bar{x}, \bar{y}), \qquad (28)$$

where $\Psi$ is quantifier-free. This fragment of first-order logic is provably weak. 

To substantiate this claim, consider the finite satisfiability problem: Given a 
first-order sentence $\Phi$ about graphs, one has to decide whether or not there is a finite 
graph satisfying $\Phi$. More generally, let Spectrum($\Phi$) consist of all those n such that 
there is a graph on n vertices satisfying $\Phi$. Thus, the problem is to decide whether 
Spectrum($\Phi$) is nonempty. 

Lavrov [52] proved that this problem is unsolvable even for sentences without 
equality (for directed graphs this is a classical result on Hilbert's Entscheidungsproblem, 
known as the Trakhtenbrot-Vaught theorem, see [14]). However, if we consider 
only sentences in the Bernays-Schönfinkel class, the finite satisfiability problem 
becomes decidable. This directly follows from the following simple observation showing 
that a nonempty spectrum always contains a certain small number. 

Lemma 7.6. Suppose that a first-order sentence $\Phi$ is of the form (28). If $\Phi$ is 
satisfiable, then Spectrum($\Phi$) contains k or a smaller number. 

Proof. Assume that $\Phi$ is true on a graph G with more than k vertices and let 
$U \subseteq V(G)$ be the set of vertices $x_1, \ldots, x_k$ whose existence is claimed by $\Phi$. Note 
that the induced subgraph G[U] satisfies $\Phi$ as well. □ 
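
The decision procedure implied by Lemma 7.6 can be sketched as a brute-force search over all graphs with at most k vertices. This is only an illustration under stated assumptions: the quantifier-free part is passed as a hypothetical Python callable `psi(adj, xs, ys)`, we assume k ≥ 1, and the function names are ours rather than the survey's. The search is exponential and meant for tiny k and l only.

```python
from itertools import combinations, product

def all_graphs(n):
    """Yield every labelled graph on vertices 0..n-1 as a set of frozenset edges."""
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for mask in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, mask) if keep}

def holds(n, edges, k, l, psi):
    """Check 'exists x_1..x_k forall y_1..y_l psi' on the given n-vertex graph."""
    def adj(u, v):
        return frozenset((u, v)) in edges  # no loops: adj(u, u) is False
    return any(
        all(psi(adj, xs, ys) for ys in product(range(n), repeat=l))
        for xs in product(range(n), repeat=k)
    )

def smallest_model_order(k, l, psi):
    """Return the least n <= k in the spectrum, or None if the sentence is
    unsatisfiable (by Lemma 7.6 it suffices to look at orders 1..k)."""
    for n in range(1, k + 1):
        for edges in all_graphs(n):
            if holds(n, edges, k, l, psi):
                return n
    return None

# exists x1 x2 forall y: x1 ~ x2   (needs an edge, so two vertices)
print(smallest_model_order(2, 1, lambda adj, xs, ys: adj(xs[0], xs[1])))
# exists x forall y: y ~ x         (fails for y = x, hence unsatisfiable)
print(smallest_model_order(1, 1, lambda adj, xs, ys: adj(xs[0], ys[0])))
```

The first call reports order 2 and the second reports `None`; in the second case the lemma is what justifies stopping after graphs of order at most k.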

The solvability of the finite satisfiability problem for the Bernays-Schönfinkel class 
was observed by Ramsey in [69]. Ramsey showed that the spectrum of a 
Bernays-Schönfinkel formula can be completely determined. This follows from the following 
result where his famous combinatorial theorem appeared as a technical tool. Recall 
that a set is cofinite if it has finite complement. 

Theorem 7.7 (Ramsey [69]). Any sentence about graphs $\Phi$ in the 
Bernays-Schönfinkel class has either finite or cofinite spectrum. More specifically, if $\Phi$ is 
of the form (28), then either Spectrum($\Phi$) contains no number equal to or greater 
than $2^k 4^l$ or it contains all numbers starting from k + l. 

Proof. Assume that $\Phi$ is true on a graph G with at least $2^k 4^l$ vertices and let 
$U \subseteq V(G)$ consist of vertices $x_1, \ldots, x_k$ whose existence is claimed by $\Phi$. Recall 
that the Ramsey number R(l) is equal to the minimum R such that every graph with 
R or more vertices contains a homogeneous set of l vertices. As is well known, 
$R(l) < 4^l$. By the Pigeonhole Principle, $V(G) \setminus U$ contains a subset W of R(l) 
vertices with the same neighborhood within U. Let X be a homogeneous set of l 
vertices in G[W]. Note that $G[U \cup X]$ satisfies $\Phi$ and that X is a set of l twins in 
this graph. Cloning the twins, we can obtain a graph that satisfies $\Phi$ and has any 
number of vertices larger than k + l. □ 
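
The cloning step at the end of this proof can be made concrete as follows. This is a minimal sketch under our own assumptions: a graph is stored as a dictionary of neighbour sets, and `clone_twin` is a hypothetical helper name. A clone is adjacent to the original vertex when the homogeneous set X is a clique and non-adjacent when X is independent.

```python
def clone_twin(adj, v, new, adjacent=False):
    """Return a copy of the graph with vertex `new` added as a twin of `v`.

    adj      -- dict mapping each vertex to the set of its neighbours
    adjacent -- True if the twins should be adjacent (X is a clique),
                False if they should be non-adjacent (X is independent)
    """
    g = {u: set(nb) for u, nb in adj.items()}
    g[new] = set(g[v])          # the clone sees exactly what v sees
    for u in g[v]:
        g[u].add(new)
    if adjacent:
        g[v].add(new)
        g[new].add(v)
    return g

# Star with centre 0 and leaves 1, 2: the leaves are non-adjacent twins.
star = {0: {1, 2}, 1: {0}, 2: {0}}
print(clone_twin(star, 1, 3))

# Triangle: any two vertices are adjacent twins; cloning gives K4.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(clone_twin(triangle, 1, 3, adjacent=True))
```

Repeating the operation adds one vertex at a time, which is how the argument reaches every order larger than k + l.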

After this small historical excursion, let us turn back to the definability with no 
quantifier alternation. First of all, note that even without quantifier alternation all 
graphs remain definable (see (jSJ) and, hence, the parameter $D_0(G)$ is well defined. 

Theorem 7.8 (Pikhurko, Spencer, and Verbitsky [61]). $D_0(G)$ is a computable 
parameter of a graph. 

Proof. Given m > 0, one can algorithmically construct a finite set $U_m$ consisting of 
0-alternating sentences of quantifier depth m so that every 0-alternating sentence 
of quantifier depth m has an equivalent in $U_m$. To decide if $D_0(G) \le m$, for each 
sentence $\Psi \in U_m$ satisfied by G we have to check if $\Psi$ can be satisfied by another 
graph G'. We first reduce $\Psi$ to an equivalent statement $\Phi$ in the Bernays-Schönfinkel 
class. Suppose that $\Phi$ has k existential quantifiers. It suffices to test all $G' \not\cong G$ with 
at most k + 1 vertices. Indeed, if $\Phi$ is true on a graph with more than k + 1 vertices 
then, by the argument used to prove Lemma 7.6, $\Phi$ is as well true on its induced 
subgraphs with k + 1 and k vertices (one of which is not isomorphic to G). □ 

We cannot prove anything similar for D(G) or even $D_1(G)$. The proof of Theorem 
7.8 is essentially based on the decidability of whether or not a 0-alternating sentence 
is defining for some graph. However, in general this problem is undecidable (see [61]). 

For the logic with no quantifier alternation, the succinctness function has much 
more regular behavior. 

Theorem 7.9 (Spencer, Pikhurko, and Verbitsky [62]). 

$$\log^* n - \log^* \log^* n - 2 \le q_0(n) \le \log^* n + 22.$$

The lower bound has to be contrasted to Theorem 7.4. It gives us a kind of a 
quantitative confirmation of the fact that the 0-alternation fragment of first-order 
logic is strictly less powerful. The upper bound improves upon the alternation 
number in (24) attaining the optimum. The proof of this bound is based on the 
unite-and-conquer construction in Section 7.1.2 where more subtle analysis is needed in 
order to achieve the zero alternation number. All the details can be found in [62]. 



Proof of Theorem 7.9 (lower bound). Given n, denote $k = q_0(n)$ and fix a graph G 
on n vertices such that $D_0(G) = k$. The same relation between $L_a(G)$ and $D_a(G)$ 
as in Theorem 2.2 is proved in [62]. By this result, G is definable by a 0-alternating 
sentence $\Psi$ of length less than Tower(k + log* k + 2). Convert $\Psi$ to an equivalent 
sentence $\Phi$ in the Bernays-Schönfinkel class and note that $D(\Phi) \le L(\Psi)$. By Lemma 
7.6, $\Phi$ must be true on some graph with at most $D(\Phi)$ vertices. Since $\Phi$ is true only 
on G, we have 

$$n \le D(\Phi) \le L(\Psi) < \mathrm{Tower}(k + \log^* k + 2).$$

This implies that 

$$\log^* n \le k + \log^* k + 2. \qquad (29)$$

Suppose on the contrary to our claim that $k \le \log^* n - \log^* \log^* n - 3$. Then 
$\log^* k \le \log^* \log^* n$ and (29) implies that 

$$\log^* n \le (\log^* n - \log^* \log^* n - 3) + \log^* \log^* n + 2,$$

which is a contradiction, proving the claimed bound. □ 

Using the lower bound of Theorem 7.9 and the absence of any recursive linkage 
between $q_3(n)$ and n, we are able to show a superrecursive gap between two 
parameters in the logical depth hierarchy 

$$D(G) \le D_3(G) \le D_2(G) \le D_1(G) \le D_0(G).$$

Theorem 7.10 (Pikhurko, Spencer, and Verbitsky [61]). There is no total recursive 
function f such that $D_0(G) \le f(D_3(G))$ for all graphs G. 

Proof. Assume that such an f exists. Let $G_n$ be a graph on n vertices for which 
$D_3(G_n) = q_3(n)$. Then 

$$f(q_3(n)) = f(D_3(G_n)) \ge D_0(G_n) \ge q_0(n) \ge \log^* n - \log^* \log^* n - 2.$$

This implies that $\mathrm{Tower}(2 f(q_3(n))) \ge n$, contradictory to Theorem 7.4. □ 

We have seen weighty evidence that the 0-alternating sentences are strictly less 
expressive than the sentences of the same quantifier depth with quantifier 
alternations. It is quite surprising that, nevertheless, sometimes we can prove for $D_0(G)$ 
upper bounds which are just a little worse than the best known bounds for D(G). 
The following results should be compared with Theorems 5.4, 5.8, and 6.4. 

Theorem 7.11. 

1. (Bohman et al. [12]) Let $D_0(n,d)$ denote the maximum of $D_0(T)$ over all 
trees with n vertices and maximum degree at most d = d(n). If both d and 
log n / log d tend to infinity, then $D_0(n,d)$ < (1 + o(1))^^. 

2. (Pikhurko, Veith, and Verbitsky [63]) $D_0(G, H)$ < for all non-isomorphic 
graphs G and H with the same number of vertices n. 

3. (Kim et al. [50]) $D_0(G_{n,1/2}) \le (2 + o(1)) \log n$ with high probability. 

We conclude this subsection with a demonstration of the somewhat surprising strength 
of the Bernays-Schönfinkel class. We say that a sentence $\Phi$ identifies a graph G if it 
distinguishes G from any non-isomorphic graph of the same order. Let BS(G) denote 
the minimum quantifier depth of $\Phi$ in the Bernays-Schönfinkel class identifying G. 
We already discussed the identification problem in Section 5.2.1. Note, however, 
a striking difference. While in Section 5.2.1 we could make the conjunction of all 
sentences distinguishing G from another graph H of the same order, now we 
have to distinguish G from all such H by a single prenex sentence! 

Theorem 7.12 (Pikhurko and Verbitsky [65]). 

1. For any graph G of order n, we have BS(G) < | n + |. 

2. With high probability we have $BS(G_{n,1/2}) \le (2 + o(1)) \log n$. Moreover, the 
latter bound holds true even if the number of universal quantifiers in an 
identifying formula is restricted to 2. 

7.4. Applications: Inevitability of the tower function. Succinctly definable 
graphs can be used to show that the tower function is sometimes unavoidable in 
relations between logical parameters of graphs. We first observe that the relationship 
between the logical depth and the logical length in Theorem 2.2 is "nearly" tight. 

Theorem 7.13 (Pikhurko, Spencer, and Verbitsky [61]).⁷ There are infinitely many 
pairwise non-isomorphic graphs G with $L(G) \ge \mathrm{Tower}(D(G) - 7)$. 

Proof. The proof is given by a simple counting argument. A first-order sentence $\Phi$ 
defining a graph G determines a natural binary encoding of G (up to isomorphism) 
of length $O(L(\Phi) \log L(\Phi))$. It follows that at most $m = 2^{O(k \log k)}$ graphs can have 
logical length less than k. By the Pigeonhole Principle, there is $n \le m + 1$ such that 
$L(G) \ge k$ for all G on n vertices. For all these graphs we have 

$$L(G) = \Omega\left(\frac{\log n}{\log\log n}\right), \qquad (30)$$

which exceeds log log n if k is chosen sufficiently large. By Theorem 7.2 there is a 
graph $G_n$ on n vertices with 

$$D(G_n) < \log^* n + 5. \qquad (31)$$

Combining (31) and (30), we obtain the desired separation of $L(G_n)$ from $D(G_n)$. 
Increasing the parameter k, we can have infinitely many such examples. □ 

⁷ In [61] we stated a better bound $L(G) \ge \mathrm{Tower}(D(G) - 6) - O(1)$, which was proved for the 
variant of L(G) where variable $x_i$ contributes $\log i$, rather than just 1, to the formula length. 
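
The step from the count of graphs to estimate (30) is brief in the proof, so here is one way to spell it out. The constant c below is our own name for the constant hidden in the exponent $O(k \log k)$, and we assume $c \ge 1$ and k sufficiently large; this is a sketch of the routine calculation, not a quotation from [61].

```latex
\begin{align*}
  n \le m + 1 \le 2^{c k \log k}
    &\;\Longrightarrow\; \log n \le c\, k \log k. \\
  \text{If } k \le \log n:\quad
    \log k \le \log\log n
    &\;\Longrightarrow\; k \ge \frac{\log n}{c \log\log n}. \\
  \text{If } k > \log n:\quad
    & k > \log n \ge \frac{\log n}{c \log\log n}
      \quad\text{(for } n \text{ large enough that } \log\log n \ge 1\text{)}. \\
  \text{In both cases}\quad
    L(G) \ge k &= \Omega\!\left(\frac{\log n}{\log\log n}\right).
\end{align*}
```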

One of the consequences of Theorem [73] is that prenex formulas are sometimes un- 
expectedly efficient in defining a graph. We are now able to show that, nevertheless, 
they generally cannot be competitive against defining formulas with no restriction 
on structure. More specifically, we have simple relations 

$$D(G) \le D^{\mathrm{prenex}}(G) \le L(G) \le L^{\mathrm{prenex}}(G). \qquad (32)$$

Combining the second inequality with Theorem 2.2 we obtain 

$$D^{\mathrm{prenex}}(G) < \mathrm{Tower}(D(G) + \log^* D(G) + 2)$$

and we can now see that this relationship between $D^{\mathrm{prenex}}(G)$ and D(G) is not so 
far from being optimal. 

Corollary 7.14. There are infinitely many pairwise non-isomorphic graphs G with 
$D^{\mathrm{prenex}}(G) \ge \mathrm{Tower}(D(G) - 8)$. 

The proof of Theorem 7.13 gives us actually a better bound, though somewhat 
cumbersome, namely $L(G) \ge T/(c \log T)$ with $T = \mathrm{Tower}(D(G) - 6)$ and c a 
constant. Corollary 7.14 follows from here simply by noticing that parameters 
$D^{\mathrm{prenex}}(G)$ and L(G) are exponentially close. The latter fact follows from (32) 
and a version of (25), namely 

$$L(G) = O(h(D^{\mathrm{prenex}}(G))) \quad\text{where } h(x) = x^2 2^{x^2}.$$

In conclusion we note that the tower function is essential also in the upper bound 
for the number of graphs definable with quantifier depth k given by Theorem 2.3. 

Corollary 7.15. There are at least (1 - o(1)) Tower(k - 2) first-order sentences of 
quantifier depth k defining pairwise non-isomorphic graphs and, hence, being 
pairwise inequivalent. 

Proof. In Section 7.1.3 we noticed that there are exactly $r_h = \mathrm{Tower}(h)$ asymmetric 
rooted trees of height at most h. Basically this follows from the fact that such a 
tree is completely characterized by the set of its branches from the root, each being 
an asymmetric rooted tree of height at most h - 1 (the root is not a part of any 
branch). Thus, $r_h - r_{h-1}$ asymmetric rooted trees have height exactly h. Note that 
$(r_{h-1} - r_{h-2}) r_{h-1}$ of them have exactly one branch of height h - 1. Therefore, there 
are at least $r_h - r_{h-1} - (r_{h-1} - r_{h-2}) r_{h-1} = (1 - o(1)) \mathrm{Tower}(h)$ asymmetric rooted 
trees whose underlying trees (with roots dismissed) have diameter 2h and, hence, 
are asymmetric too. By Lemma 7.1, each of these trees is definable with quantifier 
depth h + 2. □ 
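
The counting in this proof can be verified by brute force for small heights. The sketch below encodes an asymmetric rooted tree as the frozenset of its branches, which is exactly the characterization used in the proof; it assumes the convention Tower(0) = 1, and the function names are ours. Only small h are feasible, since the number of trees grows as a tower.

```python
from itertools import combinations

def asymmetric_rooted_trees(h):
    """All asymmetric rooted trees of height at most h.

    A tree is encoded as the frozenset of its branches; distinct elements of
    the set are automatically pairwise non-isomorphic asymmetric subtrees.
    """
    trees = {frozenset()}                      # height 0: a single root
    for _ in range(h):
        prev = list(trees)
        trees = {frozenset(c)
                 for size in range(len(prev) + 1)
                 for c in combinations(prev, size)}
    return trees

def height(tree):
    return 1 + max(map(height, tree)) if tree else 0

def count_with_two_tall_branches(h):
    """Trees with at least two branches of height exactly h - 1."""
    return sum(1 for t in asymmetric_rooted_trees(h)
               if sum(height(b) == h - 1 for b in t) >= 2)

def r(h):
    """r_h = Tower(h), with Tower(0) = 1 (assumed convention)."""
    t = 1
    for _ in range(h):
        t = 2 ** t
    return t

h = 3
print(len(asymmetric_rooted_trees(h)), r(h))
print(count_with_two_tall_branches(h),
      r(h) - r(h - 1) - (r(h - 1) - r(h - 2)) * r(h - 1))
```

For h = 3 both lines print a pair of equal numbers, matching $r_h = \mathrm{Tower}(h)$ and the expression for the number of trees with at least two branches of height h - 1.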

A lower bound of Tower(k - 2) for the number of pairwise inequivalent sentences 
of quantifier depth k is shown by Spencer [72, Theorem 2.2.2]. 

8. Open problems 

Many questions remain open, some of which are included in the main text of the 
survey alongside the known related results. For the reader's convenience we collect 
here a few open problems that we consider most interesting. 

Tomasz Łuczak (Conference on Random Structures and Algorithms, Poznań, 
2003) asked if D(G), or W(G), is a computable function of the input graph G. 

While the factor of 1/2 in Theorem 5.8 is best possible, we do not know if it can 
be improved for logic with counting. Surprisingly, we could not resolve even the 
following question. Is there $\varepsilon > 0$ such that for every graph G of sufficiently large 
order n we have $W_\#(G) \le (1/2 - \varepsilon) n$? 

Recall that no sublinear bound is generally possible here because Cai, Fürer, 
and Immerman [15] constructed graphs with linear width in the counting logic; see 
Theorem 5.7. Automorphisms of these graphs play an essential role in establishing 
this lower bound. It would be very interesting to estimate $W_\#(G)$ from above for 
asymmetric G. Again, we have only the bound $D_\#(G) \le (n + 3)/2$ as a 
straightforward corollary of Theorem 5.8 where no restriction on the automorphism group is 
supposed. 

Another research direction, with applications to the graph isomorphism problem, 
is identification of natural classes of graphs with $W_\#(G)$ bounded by a constant; 
see Sections 5.1.4 and 5.1.5. Such a bound is known for interval graphs [28, 51], 
and it is interesting if it can be extended to the class of circular-arc graphs. The 
approach suggested in [51] is based on the fact that any maximal clique in an interval 
graph is definable as the common neighborhood of some two vertices. This prevents 
any straightforward extension to circular-arc graphs, where the number of maximal 
cliques can be exponential. The (un)boundedness of $W_\#(G)$ is an interesting open 
question also for disk graphs, yet another extension of the class of interval graphs 
(Martin Grohe, 2010). 

A result of Dawar, Lindell, and Weinstein [20] (see also Theorem 4.1) implies an 
upper bound for $D_\#(G)$ in terms of $W_\#(G)$ and the order n of G, where $W_\#(G)$ 
disappointingly occurs in the exponent. Can this bound be improved? At the 
moment we cannot even exclude that $D_\#(G) = O(W_\#(G) \log n)$. If the latter bound 
was true for $D_\#^k(G)$ with $k = O(W_\#(G))$, this would have important consequences 
for isomorphism testing by Theorem 4.5. 

Where do we need the power of counting quantifiers? To keep far away from the 
trivial example of a complete or empty graph, suppose that a graph G is asymmetric. 
Is it true or not that $W(G) = O(W_\#(G) \log n)$? A random graph shows that this 
bound would be best possible. 

We are still far from having a complete evolutionary picture of the logical 
complexity for a random graph. Let $\delta \in (0, 1)$ be fixed and p be an arbitrary function 
of n with $n^{-\delta} \le p \le 1/2$. Is it true that whp $D(G_{n,p}) = O(\log n)$? 

The local behavior of the succinctness function q(n), which was defined in 
Section 7.2, is unclear. While it is trivial that $q(n + 1) \le q(n) + 1$, we do not know, for 
example, if $q(n + 1) \ge q(n) - C$ for some constant C and all n. 

In accordance with our notation system, let $q^k(n)$ denote the succinctness function 
for the k-variable logic. By slightly modifying the proof of Lemma 7.1 one can show 
that $q^3(n) \le (1 + o(1)) \log^* n$ for all n. Since the satisfiability problem for the 
3-variable logic is undecidable (see, e.g., [M]), it is not excluded that an analog of 
Theorem 7.4 can be established for $q^3(n)$. 

Given a fixed k, how far apart from one another can the values of D(G) and 
$D^k(G) < \infty$ be? 

Theorem 7.10 says that there is no recursive link between $D_3(G)$ and $D_0(G)$. Can 
one show a superrecursive gap between $D_a(G)$ and $D_b(G)$ for some b > a > 0 or, at 
least, between D(G) and $D_1(G)$? 

Though the case of trees was thoroughly investigated throughout the survey, this 
class of graphs deserves further attention. One may expect that many logical ques- 
tions for trees are easier. Note in this respect that the first-order theory of finite 
trees is decidable due to Rabin [67]. Nevertheless, we do not know, for example, 
whether or not the logical depth D(T) of a tree T is a computable parameter (while 
it is not hard to show that the logical width W(T) is computable in logarithmic 
space). 

Disappointingly, we were able to collect only a few results on the logical length for 
this survey. From the fact that there are $2^{(1/2 + o(1)) n^2}$ non-isomorphic graphs of order 
n, it is easy to derive that whp $L(G_{n,1/2}) = \Omega(n^2 / \log n)$. The obvious general upper 
bound is $O(n^2)$. This leaves open the question what the logical length of a typical 
graph is. Also, it would be very interesting to find explicit examples of graphs with 
large L(G). Pseudo-random graphs can be natural candidates. For example, it is 
well known (Blass, Exoo, and Harary [10]) that Paley graphs share the first-order 
properties of a truly random graph. 

Furthermore, we can define the succinctness function with respect to the logical 
length by s(n) = min {L(G) : v(G) = n}. Let $s_a(n)$ be the version of s(n) for the 
a-alternation logic. From Theorem 7.4 and the relation (25), it follows that s(n), and 
even $s_3(n)$, can be incomprehensibly smaller than n: for any total recursive function 
f we must have $f(s_3(n)) < n$ infinitely often. On the other hand, the estimate (30) 
in the proof of Theorem 7.13 implies that $s(n) = \Omega(\log n / \log\log n)$ for infinitely many 
n. Moreover, the same argument shows that $s^*(n) = \Omega(\log n / \log\log n)$ for all n, where 
s*(n) denotes the smoothed version of s(n) similarly to (27). How tight is the bound? 
Another interesting problem is the behavior of the function $s_0(n)$ (recall that for $q_0(n)$ 
we know the exact asymptotics owing to Theorem 7.9). Note in conclusion that 
techniques for estimating the length of a first-order formula are worked out, e.g., by 
Adler and Immerman [1], Dawar et al. [21], Grohe and Schweikardt [41]. 